Tracing cosmic evolution with clusters of galaxies 
The most successful cosmological models to date envision structure formation as a hierarchical 
process in which gravity is constantly drawing lumps of matter together to form increasingly larger 
structures. Clusters of galaxies currently sit atop this hierarchy as the largest objects that have 
had time to collapse under the influence of their own gravity. Thus, their appearance on the 
cosmic scene is also relatively recent. Two features of clusters make them uniquely useful tracers 
of cosmic evolution. First, clusters are the biggest things whose masses we can reliably measure 
because they are the largest objects to have undergone gravitational relaxation and entered into 
virial equilibrium. Mass measurements of nearby clusters can therefore be used to determine the 
amount of structure in the universe on scales of $10^{14}$-$10^{15}\,M_\odot$, and comparisons of the present-day 
cluster mass distribution with the mass distribution at earlier times can be used to measure the rate 
of structure formation, placing important constraints on cosmological models. Second, clusters 
are essentially "closed boxes" that retain all their gaseous matter, despite the enormous energy 
input associated with supernovae and active galactic nuclei, because the gravitational potential 
wells of clusters are so deep. The baryonic component of clusters therefore contains a wealth of 
information about the processes associated with galaxy formation, including the efficiency with 
which baryons are converted into stars and the effects of the resulting feedback processes on galaxy 
formation. This article reviews our theoretical understanding of both the dark-matter component 
and the baryonic component of clusters, providing a context for interpreting the flood of new 
cluster observations that are now arriving from the latest generation of X-ray observatories, large 
optical surveys, and measurements of cluster-induced distortions in the spectrum of the cosmic 
microwave background. 
I. INTRODUCTION 

Cosmology has recently reached an important mile- 
stone. A wide variety of cosmological observations now 
support a single model for the overall architecture of the 
observable universe and the development of galaxies and 
other structures within it. According to this so-called 
concordance model, the geometry of the observable uni- 
verse is indistinguishable from a flat geometry, implying 
that its total energy density is very close to the critical 
density needed to close the universe. The two dominant 
components of the universe appear to be a non-baryonic 
form of dark matter, whose gravity is responsible for 
structure formation, and a mysterious form of dark en- 
ergy, whose pressure is currently causing the expansion 
of the universe to accelerate. The mean density of bary- 
onic matter is about 15% of the total amount of matter, 
and we can observe the baryonic matter only because the 
gravitational attraction of non-baryonic dark matter has 
drawn the baryonic gas into deep potential wells, where 
a small fraction of it condenses into stars and galaxies. 

This model explains many different features of the ob- 
servable universe, but it is not entirely satisfying because 
the nature of the dark matter and the provenance of the 
dark energy remain unknown. The implications of dark 
energy for fundamental physics are particularly serious, 
so we need to be sure that it is absolutely necessary to ex- 
plain the astronomical observations. In addition, many 
aspects of galaxy formation remain poorly understood. 
Dark-matter models successfully account for the spatial 
distribution of mass in the universe, as traced by the 
galaxies, but they do not explain all the properties of 
the galaxies themselves. Dark matter initiates the pro- 
cess of galaxy formation, but once stars begin to form, 
supernova explosions and disturbances wrought by su- 
permassive black holes can inhibit further star formation 
by pumping thermal energy into the universe's baryonic 
gas. 

Clusters of galaxies are a particularly rich source of 
information about the underlying cosmological model, 
making possible a number of critical tests. According 
to the concordance model, clusters are the largest and 
most recent gravitationally-relaxed objects to form be- 
cause structure grows hierarchically. The universe begins 
in a state of rapid expansion whose current manifestation
is Hubble's Law relating a galaxy's recessional velocity
$v_r$ to its distance $d$ through Hubble's constant $H_0$:
$v_r = H_0 d$. Generalizing this feature of the local universe
to all of observable space links an object's cosmological
redshift $z = (\lambda_{\rm observed}/\lambda_{\rm rest}) - 1$
with a unique time $t(z)$ since the Big Bang, enabling us to
probe the evolution of the universe with observations of
distant objects.^ Gravity drives structure formation in this
expanding realm because the matter density is nearly equal to
the critical density during much of cosmic history. Regions
whose density slightly exceeds the mean density are therefore
gravitationally bound and eventually decouple from the
expansion, collapse upon themselves, and enter a state of
virial equilibrium in which the mean speeds of the component
particles are approximately half the escape velocity. Because
density perturbations in the concordance model have greater
amplitudes on smaller length scales, small sub-galactic
objects are the first to decouple, collapse, and virialize.
These small objects then collect into galaxies, and galaxies
later collect into clusters of galaxies, whose masses now top
out at roughly $10^{15}$ times that of the Sun
($10^{15}\,M_\odot$). Thus, the growth and development of
clusters directly traces the process of structure formation
in the universe.

^ In this definition, $\lambda_{\rm rest}$ is the wavelength
of a photon emitted by a distant object and
$\lambda_{\rm observed}$ is the wavelength it is observed to
have when it reaches Earth.

Section II outlines the observable properties of galaxy 
clusters that enable us to measure their masses. Observ- 
ables in the optical band include the overall luminosity of 
a cluster's galaxies, which scales with the overall mass, 
the velocity dispersion of a cluster's member galaxies, 
which responds to the depth of the cluster's potential 
well, and gravitational lensing of background galaxies by 
the cluster's potential. Observables in the X-ray band 
include the overall X-ray luminosity of a cluster, coming 
from the hot gas trapped in the cluster's gravitational 
potential, the temperature inferred from the X-ray spec- 
trum of that gas, and the abundances of various elements 
inferred from the emission lines in that spectrum. This 
hot gas also leaves an imprint on the microwave sky be- 
cause its electrons Compton scatter the photons of the 
cosmic microwave background radiation. Microwave ob- 
servations are therefore an alternative source of informa- 
tion about the hot gas and its temperature. 

Once we have measured the masses of a sample of clusters,
we can use that sample to study cosmology. Section III
explains how the characteristics of the cluster 
population relate to cosmological models. It begins by 
summarizing the elements of the concordance model and 
provides a number of useful analytical approximations to 
the results of numerical simulations of cluster formation 
based on this model. Then it covers the dicey middle 
ground linking those simulations with observations, cur- 
rently the main source of uncertainty in deriving cosmo- 
logical parameters from cluster observations. The sec- 
tion concludes with a look at the evolution observed in 
the cluster population and the constraints that cluster 
evolution places on cosmological models. 

Section IV takes up the subject of the baryonic component
of clusters, with two purposes in mind. First, in 
order to improve the precision of cosmological measure- 
ments with clusters, we need to know how the process 
of galaxy formation affects the relations used to derive 
cluster masses from observations of a cluster's hot gas 
and galaxies. Current numerical simulations accurately 
reproduce the behavior of the dark component, whose in- 
teractions are purely gravitational, but fail to reproduce 
with similar accuracy the observed behavior of the bary- 
onic component, whose interactions are also hydrody- 
namical and thermodynamical. These discrepancies be- 
tween simulations and observations indicate that galaxy 
formation alters the state of a cluster's hot gas in a way 
that preserves information about the poorly understood 
feedback processes that regulated galaxy formation long 
before the cluster reached its present state. Our second 
purpose is therefore to try to decipher what the state of 
the hot gas is saying about the process of galaxy forma- 
tion, so as to gain insight into those feedback processes. 
Section V concludes the review with some brief remarks 
about ongoing and future cluster surveys. 

Despite this article's length, it falls somewhat short of 
being a comprehensive review of cluster physics, which 
would require more pages than this journal is inclined to 
provide. Instead, I have tried to assemble a readable in- 
troduction to cluster evolution for non-experts, concen- 
trating on the middle ground connecting theory to ob- 
servations and distilling the key theoretical results into 
a set of simple analytical tools useful to observers. For 
more on the subject of clusters and their evolution, readers
should consult Sarazin (1988), Borgani et al. (2002b),
Rosati et al. (2002), and Mulchaey et al. (2004).



II. OBSERVABLE PROPERTIES OF CLUSTERS 

Clusters of galaxies might have been called something 
different if they had first been discovered in a waveband 
other than visible light, because all of the stars in all 
of a cluster's galaxies represent only a small fraction of 
a cluster's overall mass. Clusters contain substantially 
more mass in the form of hot gas, observable with X- 
ray and microwave instruments. This section outlines 
how clusters are observed in all three of these wavebands 
and how those observations reveal a cluster's total mass, 
which turns out to be about seven times the combined
baryonic mass in stars and hot gas (Allen et al., 2002;
David et al., 1995; Evrard, 1997; White et al.).



A. Clusters in Optical Light 

Optical identification of galaxy clusters has been going 
on for quite a long time. By the end of the eighteenth century
Charles Messier (1784) and William Herschel (1785)
had already recognized concentrations of galaxies in the 
constellations Virgo and Coma Berenices. Today these 
clusters of galaxies are known as the Virgo cluster and 
the Coma cluster. Optical discoveries of clusters contin- 
ued to accumulate as observing power grew over the next 
two centuries (see Biviano, 2000, for a review of the history),
culminating with the definitive cluster catalogs of
George Abell and collaborators (Abell, 1958; Abell et al.,
1989). Abell's catalogs contain most of the known nearby 
galaxy clusters and are the foundation for much of our 
modern understanding of clusters. 



Abell recognized that projection effects can complicate 
the identification of clusters in optical galaxy surveys 
and therefore was careful in defining his clusters. Work- 
ing from the Palomar Sky Survey plates he estimated 
the distance of each cluster candidate from the apparent 
brightness of its tenth brightest member galaxy. He then 
counted all the galaxies lying within a fixed projected ra- 
dius and brighter than a magnitude limit two magnitudes 
fainter than the third brightest member. The bounding 
radius, which he determined from the distance estimate, 
is now known to be ~ 2 Mpc and was the same for all 
clusters.^ In order to compensate for projection effects, 
he subtracted from his galaxy counts a background level 
equivalent to the mean number of galaxies brighter than 
the magnitude limit for the cluster in similarly-sized, 
cluster-free regions of the plate, and retained all clus- 
ter candidates with a net excess of 50 galaxies brighter 
than the limiting magnitude. 

Most of the optical cluster identification techniques
used today extend and refine Abell's basic approach
(e.g., Dalton et al., 1997; Lumsden et al., 1992;
Postman et al., 1996), often augmenting it with information
about galaxy colors (e.g., Bahcall et al., 2003b;
Gladders and Yee, 2005; Nichol, 2004). These improvements
are necessary because the contrast of clusters 
against the background galaxy counts decreases with 
cluster distance. Galaxy colors can help identify distant 
clusters because many cluster galaxies are significantly 
redder than other galaxies at a similar redshift, owing 
to their lack of ongoing star formation. The colors of 
their aging stellar populations therefore place these clus- 
ter members on a narrow and distinctive locus known 
as the "red sequence" in a plot of galaxy color versus 
magnitude (e.g., Gladders and Yee, 2000).

Once suitable cluster candidates are found, their status 
as true mass concentrations can be checked by measuring 
the underlying mass. Optical observations offer two com- 
plementary ways to perform such measurements, through 
the orbital velocities of the member galaxies and through 
the degree to which galaxies lying behind the cluster are 
lensed by the cluster's gravitational potential. We will 
discuss both of these methods after a few more words 
about how galaxy counts relate to the overall optical lu- 
minosities of clusters. 



1. Optical Richness 

To the extent that light traces mass in the universe, 
the total optical luminosity of a cluster is itself an in- 
dicator of a cluster's mass. Measuring the luminosity of 
every galaxy in a cluster is impractical, especially for dis- 
tant clusters in which only the brightest galaxies can be 



^ The Megaparsec is astronomers' favored unit of distance on
cluster scales: 1 Mpc = $3.09 \times 10^{24}$ cm = $3.26 \times 10^{6}$ light years.






observed. However, because the luminosity distribution 
function of cluster galaxies is nearly the same from clus- 
ter to cluster, observing the high-luminosity tip of that 
distribution allows one to normalize the overall galaxy 
luminosity function for the cluster, yielding estimates for 
both the cluster's total optical luminosity and its mass. 

Abell's catalogs encode this information by placing 
clusters in categories of "richness" corresponding to the 
net excess of galaxies brighter than the magnitude limit 
used to define each cluster. The richest clusters (class 
5) contain over 300 galaxies brighter than the magnitude
limit, while the poorest (class 1) contain only 50-79 
such galaxies. Clusters not quite making Abell's cut (30- 
49 galaxies above the magnitude limit) were assigned to 
richness class zero. Within this system, the Coma cluster 
originally ranked as richness class 2. 

Invoking assumptions about the shape of the lumi- 
nosity distribution function helps to link richness more 
directly to a cluster's total luminosity. Cluster galaxies
generally adhere to a luminosity distribution function
following the form proposed by Schechter (1976), with
the number of galaxies in luminosity range $dL$ about
$L$ proportional to $L^{-\alpha} \exp(-L/L_*)$, with
$\alpha \approx 1$ (e.g., Balogh et al., 2001a). Assuming
this distribution function, Postman et al. (1996) define a
richness parameter $\Lambda_{\rm cl}$ equivalent to the number
of cluster galaxies brighter than the characteristic
luminosity $L_*$. They find that $\Lambda_{\rm cl}$ is highly
correlated with Abell's richness measure, but the scatter
between richness and $\Lambda_{\rm cl}$ is large.
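
The link between a count of bright galaxies and a total luminosity can be illustrated numerically. The following is a minimal sketch, assuming a pure Schechter form with no faint-end cutoff; the function names, the integration limit, and the step count are illustrative choices and do not come from the works cited above.

```python
import math

def schechter_number_above(l_min, alpha, n_steps=20000, x_max=60.0):
    """Integrate x**(-alpha) * exp(-x) from l_min to x_max, with x = L / L_*.

    Uses composite Simpson's rule. x_max is an assumed cutoff; the
    exponential makes the neglected tail negligible.
    """
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    h = (x_max - l_min) / n_steps

    def integrand(x):
        return x ** (-alpha) * math.exp(-x)

    total = integrand(l_min) + integrand(x_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(l_min + i * h)
    return total * h / 3.0

def luminosity_per_bright_galaxy(alpha=1.0):
    """Total luminosity, in units of L_*, per galaxy brighter than L_*."""
    # Integral of x**(1 - alpha) * exp(-x) over all x is Gamma(2 - alpha).
    total_luminosity = math.gamma(2.0 - alpha)
    bright_count = schechter_number_above(1.0, alpha)
    return total_luminosity / bright_count

print(round(luminosity_per_bright_galaxy(1.0), 2))
```

For $\alpha = 1$ the ratio is about 4.56, so under these assumptions a cluster with $N$ galaxies brighter than $L_*$ would have a total luminosity of roughly $4.6\,N L_*$. Real luminosity functions deviate from the pure Schechter form, which is one source of the scatter mentioned above.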

Another richness parameter in current use is $B_{\rm cg}$, the
amplitude of the correlation function between the cluster
center and the member galaxies (Longair and Seldner, 1979;
Yee and Lopez-Cruz, 1999). It is derived from the
angular correlation function of galaxies measured down
to a given magnitude limit, after removing the background
counts, and is normalized by dividing out the
expected luminosity distribution function of galaxies
integrated down to that magnitude limit. This richness
parameter also correlates with Abell's richness, but again
the scatter is broad. Yee and Ellingson (2003) show that
$B_{\rm cg}$ correlates well with other global properties of
clusters, suggesting that richness observations may become 
an inexpensive way to measure cluster masses, but first 
the mass-richness relation must be calibrated and the 
scatter in that relation must be quantified. 



2. Galaxy Velocities 

Once a cluster has been optically identified, obtaining
the radial velocities $v_r$ of the cluster galaxies from 
their redshifts helps in mitigating projection effects and 
in measuring the cluster's mass. Because the velocity dis- 
tribution of a relaxed cluster's galaxies is expected to be 
gaussian in velocity space, galaxies with velocities falling 
well outside the best-fitting gaussian envelope are un- 
likely to be cluster members and are generally discarded. 
Fitting the velocity distribution
$\exp[-(v_r - \langle v_r \rangle)^2 / 2\sigma_{1D}^2]$
to the remaining galaxies then yields a one-dimensional
velocity dispersion $\sigma_{1D}$ for the cluster. If the velocity
distribution of a cluster candidate is far from gaussian,
then it is probably not a real cluster but rather a chance
superposition of smaller structures. Obviously, the accuracy
of $\sigma_{1D}$ depends critically on the number of galaxies
with measured velocities and the method for identifying
and eliminating non-members.
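
One simple way to discard non-members before estimating the dispersion is iterative sigma clipping. The sketch below is only an illustration of that idea: the 3-sigma threshold, the function name, and the synthetic velocities are all assumptions, and published analyses use more careful membership criteria.

```python
from statistics import NormalDist, mean, stdev

def clipped_dispersion(velocities, n_sigma=3.0, max_iter=20):
    """Iteratively reject velocities more than n_sigma from the mean.

    Returns (mean velocity, one-dimensional dispersion, retained velocities).
    The n_sigma = 3 threshold is an assumed, commonly used choice.
    """
    kept = list(velocities)
    for _ in range(max_iter):
        center, sigma = mean(kept), stdev(kept)
        survivors = [v for v in kept if abs(v - center) <= n_sigma * sigma]
        if len(survivors) == len(kept) or len(survivors) < 2:
            break  # converged, or too few galaxies left to continue
        kept = survivors
    return mean(kept), stdev(kept), kept

# Synthetic example: 200 "members" placed at evenly spaced gaussian quantiles
# (mean 7000 km/s, dispersion 700 km/s) plus 5 hypothetical interlopers.
members = [NormalDist(7000.0, 700.0).inv_cdf((i + 0.5) / 200) for i in range(200)]
interlopers = [15000.0] * 5

center, sigma, kept = clipped_dispersion(members + interlopers)
print(len(kept), round(center), round(sigma))
```

In this synthetic case the first pass removes the five interlopers and the second pass leaves the member list unchanged, so the recovered dispersion is close to the input 700 km/s.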

Zwicky (1933, 1937) was the first to measure a cluster's
velocity dispersion, finding
$\sigma_{1D} \sim 700\,{\rm km\,s^{-1}}$ for the
Coma cluster. He correctly concluded from this fact and
his estimate of the Coma cluster's overall radius that this
cluster's mass must be far greater than the observed mass
in stars, the first evidence for dark matter in the universe.
Shortly thereafter, Smith (1936) showed that the
same was true of the Virgo cluster. Zwicky's reasoning
involved the virial theorem of classical mechanics, which
applies to steady, gravitationally bound systems.
Differentiating the system's moment of inertia
$I = \sum_i m_i r_i^2$ twice with respect to time and setting
the result to zero produces the virial relation

$$ \sum_i m_i v_i^2 = -\sum_i \mathbf{r}_i \cdot \mathbf{F}_i . \qquad (1) $$

The left-hand side is twice the total kinetic energy of the
cluster's particles, and in a spherically symmetric system
of mass $M$ with a gaussian velocity distribution, that
kinetic energy is $3M\sigma_{1D}^2/2$. If the system is isolated, then
the right-hand side is equal to the absolute value of the
gravitational potential energy, which can be expressed as
$GM^2/r_G$, where

$$ r_G = M^2 \left( \sum_i \sum_{i<j} \frac{m_i m_j}{r_{ij}} \right)^{-1}
       \approx \frac{\pi}{2} M^2 \left( \sum_i \sum_{i<j} \frac{m_i m_j}{r_{\perp,ij}} \right)^{-1} \qquad (2) $$

and $r_{ij}$ is the separation between particles $i$ and $j$. The
approximation gives $r_G$ for a spherically symmetric system
in terms of the projected particle separations $r_{\perp,ij}$ in
the plane of the sky (Limber and Mathews, 1960). According
to the virial theorem, the mass of a spherical,
isolated cluster should therefore be $M = 3\sigma_{1D}^2 r_G/G$.
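
To give a feel for the numbers, the virial estimate $M = 3\sigma_{1D}^2 r_G/G$ can be sketched in a few lines. This is an illustration only: the function names are invented here, the gravitational radius of 2 Mpc is a hypothetical input rather than a measured value, and the value of $G$ in these units is a standard conversion that does not appear in the text.

```python
import math

# Newton's constant in Mpc (km/s)^2 per solar mass (standard conversion).
G_MPC = 4.30091e-9

def gravitational_radius_projected(masses, positions):
    """Estimate r_G from projected positions (x, y) in Mpc.

    Each pair is counted once, and the pi/2 factor converts projected
    separations to three-dimensional ones for a spherical system.
    """
    total_mass = sum(masses)
    pair_sum = 0.0
    for i in range(len(masses)):
        for j in range(i):
            separation = math.dist(positions[i], positions[j])
            pair_sum += masses[i] * masses[j] / separation
    return 0.5 * math.pi * total_mass ** 2 / pair_sum

def virial_mass(sigma_1d_kms, r_g_mpc):
    """Virial mass in solar masses for an isolated, spherical system."""
    return 3.0 * sigma_1d_kms ** 2 * r_g_mpc / G_MPC

# Zwicky's dispersion for Coma with an assumed (hypothetical) r_G of 2 Mpc.
print(f"{virial_mass(700.0, 2.0):.2e}")
```

With these inputs the estimate comes out near $7 \times 10^{14}\,M_\odot$, which shows how a dispersion of several hundred km/s on megaparsec scales implies a mass far larger than that of the visible stars.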

Applying this virial analysis to real clusters is not quite 
so simple because clusters are not isolated systems — 
there is no clean boundary separating a cluster from the 
rest of the universe. Segregating the cluster from the 
outlying regions with an arbitrary bounding surface al- 
ters the interpretation of the right-hand side of equation 
(1). In a steady state, the momentum flux of particles
exiting the boundary is equal to that entering, so the
bounding surface is formally equivalent to a reflecting
wall that adds a pressure correction term, offsetting some
of the gravitational potential energy (Carlberg et al.,
1997a; The and White, 1986). One must also account for
objects seen in projection, such as infalling galaxies that
have not yet entered into virial equilibrium and interlopers
that are not true cluster members, problems that have
led to the invention of various kinds of projected mass
estimators (Bahcall and Tremaine, 1981; Heisler et al.).

Extensive redshift measurements now allow observers 
to measure much more than just a cluster's veloc- 
ity dispersion, enabling detailed studies of a cluster's 
mass profile and dynamical state. Generally a clus- 
ter's velocity dispersion declines with projected ra- 
dius, implying that the relationship between projected 
radius and the mass enclosed within that radius is 
somewhat shallower than linear in the cluster's outskirts
(Carlberg et al., 1997a; Kent and Gunn, 1982;
Rood et al., 1972). Beyond the approximate virial radius
of a cluster, the enclosed mass continues to increase
and the galaxies move primarily on infalling radial
trajectories (Biviano and Girardi, 2003; Diaferio and Geller,
1997; Kaiser, 1987; Regos and Geller, 1989). Even farther
out is a thin region where galaxies are nearly stationary
with respect to the cluster because there the cluster's
gravity has just succeeded in reducing the outward Hubble
flow to a standstill (Kaiser, 1987; Rines et al., 2003).
Eventually these galaxies will fall back toward the cluster 
and become cluster members. 

Because clusters are dynamical systems that have not 
quite finished forming and equilibrating, the velocity dis- 
persion and virial theorem by themselves do not yield an 
exact cluster mass measurement. Detailed information 
on the spatial distribution of galaxy velocities is of great 
help in measuring the masses of large, nearby clusters 
but similar information is very difficult to obtain for the 
distant clusters so interesting to cosmologists. In lieu of 
detailed observations, one can use simulations of cluster 
formation to calibrate the approximate virial relationship 
between velocity dispersion and cluster mass, but we will 
postpone discussion of that procedure to the discussion 
of dark-matter dynamics in Sec. III.



3. Gravitational Lensing 

In his remarkable 1937 paper on the Coma cluster, 
Zwicky also proposed that cluster masses could be mea- 
sured through gravitational lensing of background galax- 
ies. That technique did not become practical for six more 
decades but is now one of the primary methods for mea- 
suring cluster mass. Lensing is sensitive to the cluster's 
mass within a given projected radius $r_\perp$ because the mass 
within this radius deflects photons toward our line of 
sight through the cluster's center. When the deflection 
angle is small compared to a background galaxy's angular 
distance from the cluster center, weak lensing shifts each 
point in the galaxy's image to a slightly larger angular 
distance from the cluster's center, thereby distorting the 
image by stretching it tangentially to $r_\perp$. Measuring the 
weak-lensing distortion of any single galaxy is nearly im- 



possible because the exact shape of the unlensed galaxy 
is generally unknown. Instead, observers must measure 
the shear distortion of an entire field of background galax- 
ies, under the assumption that any intrinsic deviations of 
galaxy images from circular symmetry are uncorrelated. 

Many excellent articles explain this weak-lensing technique
in more detail (Bartelmann and Schneider, 2001;
Hoekstra et al., 1998; Kaiser and Squires, 1993; Mellier,
1999; Tyson et al., 1990). Here we wish only to give a
flavor of how a cluster's mass can be measured from the 
the gradient of the gravitational potential in the lensing 
system, meaning that a mass sheet of constant surface 
density produces no shear and goes undetected. Additional
mass that is distributed symmetrically about
the line of sight through a cluster's center bends photon
paths by an angle twice that expected from Newtonian
physics, $4GM(<r_\perp)/c^2 r_\perp$, which can be measured
from the shear distortion and redshift distribution of the
background galaxies. Obtaining a cluster mass from the
mass $M(<r_\perp)$ along the column bounded by $r_\perp$ requires
additional assumptions about how mass is distributed
within this column. A particularly simple mass configuration
would be a singular isothermal sphere, in which
$\sigma_{1D}$ remains constant with radius and
$M(r) = 2\sigma_{1D}^2 r/G$ (Sec. III.B.2); notice that the
boundary pressure term required in this configuration alters
the usual virial relation. The deflection angle for this mass
distribution is $4\pi\sigma_{1D}^2/c^2$, independent of radius.
In general, however, 
the cluster potential will not be precisely isothermal, nor 
will the cluster be perfectly spherical. 
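
The size of the lensing deflection for a singular isothermal sphere is easy to evaluate. The following is a minimal sketch, assuming a dispersion of 1000 km/s as a representative rich-cluster value; the function names are illustrative, and the projected-mass expression $M(<r_\perp) = \pi\sigma_{1D}^2 r_\perp/G$ used in the cross-check is the standard result for this density profile rather than something stated in the text.

```python
import math

C_KMS = 299792.458                      # speed of light in km/s
G_MPC = 4.30091e-9                      # G in Mpc (km/s)^2 per solar mass
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def sis_deflection_arcsec(sigma_1d_kms):
    """Deflection angle 4 pi sigma^2 / c^2 of a singular isothermal sphere."""
    return 4.0 * math.pi * (sigma_1d_kms / C_KMS) ** 2 * ARCSEC_PER_RADIAN

def deflection_from_projected_mass(mass_msun, r_perp_mpc):
    """Deflection angle 4 G M(<r_perp) / (c^2 r_perp) in arcseconds."""
    return 4.0 * G_MPC * mass_msun / (C_KMS ** 2 * r_perp_mpc) * ARCSEC_PER_RADIAN

sigma = 1000.0                          # km/s, assumed representative value
r_perp = 0.5                            # Mpc, arbitrary projected radius
projected_mass = math.pi * sigma ** 2 * r_perp / G_MPC

print(round(sis_deflection_arcsec(sigma), 1))
print(round(deflection_from_projected_mass(projected_mass, r_perp), 1))
```

Both routes give a deflection of about 29 arcseconds whatever projected radius is chosen, which is the radius independence noted above.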

Simulations of large-scale structure formation suggest 
that superpositions of other mass concentrations limit 
the accuracy of weak-lensing masses, at least for clus- 
ters defined to be within spherical volumes. Projected 
mass fluctuations along the line of sight to a distant cluster
can be on the order of $\sim 10^{14}\,M_\odot$ (Hoekstra, 2001;
Metzler et al., 1999, 2001). On the other hand, weak-
lensing masses are expected to correlate quite well with 
cluster richness, another measure of the mass within a 
cylindrical region, raising the possibility that at least 
some of the projected mass can be accounted for by using 
galaxy colors to separate these mass concentrations from 
the cluster in redshift space. 



B. Clusters in X-rays 

Clusters of galaxies are X-ray sources because galaxy 
formation is inefficient. Only about a tenth of the uni- 
verse's baryons reside with stars in galaxies, leaving the 
vast majority adrift in intergalactic space. Most of these 
intergalactic baryons are extremely difficult to observe, 
but the deep potential wells of galaxy clusters compress 
the associated baryonic gas and heat it to X-ray emit- 
ting temperatures. The gas temperature inferred from a 
cluster's X-ray spectrum therefore indicates the depth of 
a cluster's potential well, and the emission-line strengths 
in that spectrum indicate the abundances of elements
like iron, oxygen, and silicon in the intracluster medium
(ICM). Here we outline the primary characteristics of
that X-ray emission. For a more detailed discussion of
the physics, see Sarazin (1988).



1. X-ray Surface Brightness 

Extended X-ray emission from clusters of galaxies was
first observed in the early 1970s (Forman et al., 1972;
Gursky et al., 1971; Kellogg et al., 1972), but was
correctly attributed to thermal bremsstrahlung several years
earlier by Felten et al. (1966), who were inspired by a
spurious X-ray detection of the Coma cluster. For typical
cluster temperatures ($kT > 2$ keV) the emissivity
of thermal bremsstrahlung dominates that from emission
lines, but below $\sim 2$ keV that situation reverses,
given the typical heavy-element abundances relative to
hydrogen, which are $\sim 0.3$ times those found in the Sun.
The rate at which the ICM radiates energy can be
expressed in terms of a cooling function $\Lambda_c(T)$ computed
assuming that collisional ionization equilibrium determines
the relative abundance of each ion. Many collisional
ionization codes have been developed to compute
the emissivity and X-ray spectrum of such gas (e.g.,
Raymond and Smith, 1977). Because these cooling processes
all involve electrons colliding with ions, the resulting
cooling function is usually defined so that either
$n_e n_p \Lambda_c(T)$ or $n_e n_{\rm ion} \Lambda_c(T)$ is the
luminosity per unit volume. Tozzi and Norman (2001) give a
useful fit to the computations of Sutherland and Dopita (1993)
for abundances equal to 0.3 times their solar values. For typical
ICM temperatures, $\Lambda_c \sim 10^{-23}\,{\rm erg\,cm^{3}\,s^{-1}}$.

In most clusters, the intracluster gas appears to be in 
approximate hydrostatic equilibrium. Assuming spher- 
ical symmetry, the equation of hydrostatic equilibrium 
can be written 

$$ \frac{d\ln\rho_g}{d\ln r} + \frac{d\ln T}{d\ln r} = -2\,\frac{T_\phi(r)}{T} , \qquad (3) $$

where $\rho_g$ is the gas density and
$k_B T_\phi(r) = GM(r)\mu m_p/2r$ is the characteristic
temperature of a singular isothermal sphere with the same
value of $M(r)/r$. Making the additional assumption that the
gas is isothermal leads to a classic model for the X-ray
surface brightness of clusters known as the beta model
(Cavaliere and Fusco-Femiano, 1976). If the velocity
distribution of the particles responsible for $M(r)$ is also
isothermal with a constant velocity dispersion $\sigma_{1D}$, then
Poisson's equation implies

$$ \frac{d\ln\rho_g}{dr} = -\frac{\mu m_p}{k_B T}\frac{d\phi}{dr} = \beta\,\frac{d\ln\rho}{dr} , $$

where the eponymous $\beta = \mu m_p \sigma_{1D}^2 / k_B T$ (e.g.,
Sarazin, 1988). Given the approximate isothermal potential of
King (1962), $\rho(r) \propto [1 + (r/r_c)^2]^{-3/2}$, in which
$r_c$ is a core radius that keeps the profile from becoming
singular at the origin, the gas density profile becomes
$\rho_g(r) \propto [1 + (r/r_c)^2]^{-3\beta/2}$. The expected X-ray
surface brightness profile for an isothermal gas is then
$\propto [1 + (r/r_c)^2]^{-3\beta+1/2}$, and fitting this model to the
observations gives the best-fit parameters $r_c$, $\beta_{\rm fit}$,
and the normalization of the gas-density distribution.
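
The step from the gas density profile to the surface brightness profile can be checked numerically: for isothermal gas the emissivity scales as $\rho_g^2$, and integrating it along the line of sight should reproduce the exponent $-3\beta + 1/2$. The sketch below assumes $\beta = 2/3$, measures lengths in units of $r_c$, and uses an arbitrary integration limit and step count; none of these choices comes from the text.

```python
import math

def beta_gas_density(r, r_c=1.0, beta=2.0 / 3.0):
    """Beta-model gas density in arbitrary units."""
    return (1.0 + (r / r_c) ** 2) ** (-1.5 * beta)

def projected_brightness(r_perp, r_c=1.0, beta=2.0 / 3.0,
                         l_max=200.0, n_steps=40000):
    """Integrate rho_g**2 along the line of sight with Simpson's rule.

    l_max (in the same units as r_c) is an assumed cutoff standing in
    for infinity; n_steps must be even.
    """
    h = l_max / n_steps

    def emissivity(l):
        return beta_gas_density(math.hypot(r_perp, l), r_c, beta) ** 2

    total = emissivity(0.0) + emissivity(l_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * emissivity(i * h)
    return 2.0 * total * h / 3.0  # factor 2 covers both halves of the sightline

def beta_model_brightness(r_perp, r_c=1.0, beta=2.0 / 3.0):
    """Analytic beta-model profile, normalized to 1 at the center."""
    return (1.0 + (r_perp / r_c) ** 2) ** (-3.0 * beta + 0.5)

ratio = projected_brightness(2.0) / projected_brightness(0.0)
print(round(ratio, 4), round(beta_model_brightness(2.0), 4))
```

The numerically projected profile and the analytic expression agree closely at $r_\perp = 2 r_c$, as expected when the gas is strictly isothermal.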

Beta models generally describe the observed surface-brightness
profiles of clusters quite well in the radial
range from $\sim r_c$ to $\sim 3r_c$, with $\beta_{\rm fit} \approx 2/3$
and $r_c \approx 0.1\,r_{200}$ giving the best fits for rich clusters
(Jones and Forman, 1984) and a possible trend toward
lower values in poorer clusters (Finoguenov et al.,
2001b; Helsdon and Ponman, 2000; Horner et al., 1999;
Sanderson et al., 2003). The X-ray luminosity integrated
over radius converges for $\beta > 0.5$, meaning that most of
the observed X-rays come from a relatively small proportion
of the ICM. However, beta models often underestimate
the central surface brightness (Jones and Forman, 1984)
and tend to overestimate the brightness at $r \gg r_c$
(Vikhlinin et al., 1999). These discrepancies arise in part
because the intracluster medium is not strictly isothermal
(Sec. III.C.2) and because real cluster potentials differ
from the King model (Sec. III.B.2).

The centrally concentrated surface-brightness profiles 
of clusters make X-ray surveys very effective at finding 
cluster candidates. Because X-ray emission depends on 
density squared, clusters of galaxies strongly stand out 
against regions of lesser density, minimizing the complications
of projection effects (see Rosati et al., 2002,
for a recent review). Surveys of X-ray selected clusters
currently extend to $z \approx 1.3$ (e.g., Rosati et al., 2004;
Stanford et al., 2003), a limit owing to the decline of
surface brightness with redshift (Sec. III.A.2). Unfortunately,
X-ray luminosity correlates less well than one
would like with the optical properties of clusters. Early
studies showed that X-ray luminosity correlates with optical
richness but with a large scatter (Bahcall, 1977;
Mushotzky, 1984), and that situation has not improved
much in the intervening decades (Donahue et al., 2002;
Gilbank et al., 2003; Kochanek et al., 2001). The optical
properties of very luminous X-ray clusters are well
behaved (Lewis et al., 1999), but deep optical surveys
have found distant cluster candidates that appear to have
velocity dispersions much larger than one would guess
from their X-ray luminosity (Lubin et al., 2004). These
objects may be superpositions of smaller clusters
whose joint velocity distribution seems like that of a
larger relaxed cluster.



2. Plasma Temperature 

Clusters in hydrostatic equilibrium have a plasma tem- 
perature that is closely related to the overall mass. Mea- 
suring that temperature requires higher quality data than 
a simple luminosity measurement, because the photons 
must be divided among multiple energy bins. Ideally, 
one would like enough data to measure both $T(r)$ and
$\rho_g(r)$, in which case equation (2) can be solved directly
for $M(r)$. Even with the highest-quality data, the derived
mass is still slightly model dependent because $T(r)$
and $\rho_g(r)$ must be determined by deprojecting the
surface-brightness information (Fabian et al., 1981; Kriss et al.,
1983; Pizzolato et al., 2003; White et al., 1997).

In practice, the quality of the mass measurement depends
on what the total number of observed X-ray photons
allows. With limited information about the temperature
gradient, one can fit a polytropic law*
$T \propto \rho_g^{\gamma_{\rm eff}-1}$,
giving the radial dependence of temperature in terms
of an effective adiabatic index $\gamma_{\rm eff}$ with density as the
radial coordinate. However, data on distant clusters
often do not allow a temperature gradient to be measured
and sometimes are even insufficient to give an accurate
temperature. In those cases, one must rely on
scaling laws that connect X-ray luminosity with temperature
and temperature with mass, calibrated with either
high-quality observations or numerical simulations
of cluster formation that include all the relevant physics
(Sec. III).

Limitations in the measurement of cluster temperature
systematically affect the mass one infers for the cluster.
If only a single temperature can be measured, then the
isothermal beta model implies

$$ \frac{M(r)}{r} = \frac{3 \beta k_B T}{G \mu m_p} \frac{(r/r_c)^2}{1 + (r/r_c)^2} . \qquad (5) $$

Note that at large radii this relation approaches the
one for isothermal gas in a singular isothermal potential,
$M(r)/r = 2 k_B T / G \mu m_p$, as long as $\beta = 2/3$. However,
single temperatures gleaned from a cluster's overall
spectrum need to be treated with caution. Global
cluster temperatures quoted in the literature are generally
spectral-fit temperatures ($T_{\rm sp}$) obtained by fitting
a single-temperature emission model to an overall cluster
spectrum containing multiple temperature components.
These spectral-fit temperatures are similar to,
but not identical to, the cluster's luminosity-weighted
temperature $T_{\rm lum}$ in which each temperature component
is weighted by $\rho_g^2$. Numerical simulations indicate that
both $T_{\rm sp}$ and $T_{\rm lum}$ can differ from the mass-weighted
gas temperature $T_g$ and from one another by $\sim 10$-$20\%$
(Mathiesen and Evrard, 2001; Mazzotta et al., 2004).

A modest amount of spatially resolved temperature
information improves the mass measurement. Allowing
for a temperature gradient corresponding to $T \propto \rho_g^{\gamma_{\rm eff}-1}$
changes the estimated mass to

$$ \frac{M(r)}{r} = \frac{3 \beta \gamma_{\rm eff} k_B T(r)}{G \mu m_p} \frac{(r/r_c)^2}{1 + (r/r_c)^2} . \qquad (6) $$
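
As a rough numerical illustration of the two beta-model mass estimates above, the following Python sketch evaluates the enclosed mass for an isothermal gas (gamma_eff = 1) or a polytropic gradient (gamma_eff > 1). The function name, the rounded constants, the mean molecular weight mu = 0.6, and the example cluster parameters are assumptions made for illustration; none of them come from the text.

```python
# Physical constants in SI units (rounded standard values).
G = 6.674e-11           # gravitational constant [m^3 kg^-1 s^-2]
M_PROTON = 1.6726e-27   # proton mass [kg]
KEV = 1.6022e-16        # 1 keV in joules
KPC = 3.0857e19         # 1 kpc in metres
M_SUN = 1.989e30        # solar mass [kg]


def beta_model_mass(r_kpc, kT_keV, r_core_kpc, beta=2.0 / 3.0,
                    gamma_eff=1.0, mu=0.6):
    """Mass inside radius r, in solar masses, for a beta-model gas profile.

    kT_keV is the gas temperature at radius r. gamma_eff = 1 reproduces the
    isothermal estimate; gamma_eff > 1 gives the polytropic estimate.
    mu = 0.6 is an assumed mean molecular weight (hypothetical default).
    """
    x2 = (r_kpc / r_core_kpc) ** 2
    mass_per_length = (3.0 * beta * gamma_eff * kT_keV * KEV
                       / (G * mu * M_PROTON))          # [kg / m]
    return mass_per_length * x2 / (1.0 + x2) * r_kpc * KPC / M_SUN


# Illustrative cluster (assumed values): kT = 6 keV, r_c = 200 kpc, r = 2 Mpc.
m_iso = beta_model_mass(2000.0, 6.0, 200.0)
print(f"isothermal estimate: {m_iso:.2e} solar masses")
```

For these assumed inputs the isothermal estimate comes out a little below 10^15 solar masses, the right order of magnitude for a rich cluster.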



* Note that this is not an actual equation of state for the gas but
only a fitting formula for $T(r)$ as a function of $\rho_g(r)$.



Observers are still working toward a consensus
on the temperature gradients of clusters
(De Grandi and Molendi, 2002; Irwin and Bregman,
2000; Markevitch et al., 1998; Mushotzky, 2004;
Pratt and Arnaud, 2002), but measured values of $\gamma_{\rm eff}$
often range as high as 1.2 (Finoguenov et al., 2001b).
Cluster temperatures are extremely difficult to observe in
the neighborhood of the virial radius, but extrapolating
a $\gamma_{\rm eff} = 1.2$ gradient to $10 r_c$ leads to a gas temperature
less than half the core temperature. Including
temperature-gradient information can therefore lower
the estimated mass for a cluster of temperature $T_{\rm lum}$ by
up to $\sim 50\%$.
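
The size of this effect can be checked with a short Python sketch. It assumes a beta-model density profile with beta = 2/3 and takes the luminosity-weighted temperature to be close to the core temperature. Both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def polytropic_temperature_ratio(r_over_rc, beta=2.0 / 3.0, gamma_eff=1.2):
    """T(r)/T(0) for a beta-model density with T proportional to rho^(gamma_eff - 1)."""
    # Beta-model gas density relative to its central value.
    density_ratio = (1.0 + r_over_rc ** 2) ** (-1.5 * beta)
    return density_ratio ** (gamma_eff - 1.0)


gamma_eff = 1.2
t_ratio = polytropic_temperature_ratio(10.0, gamma_eff=gamma_eff)

# Ratio of the polytropic mass estimate (using the local temperature) to the
# isothermal estimate that uses the core temperature at every radius.
mass_ratio = gamma_eff * t_ratio

print(f"T(10 r_c) / T(core)   = {t_ratio:.2f}")
print(f"M_polytropic / M_iso  = {mass_ratio:.2f}")
```

Under these assumptions the temperature at ten core radii is about 40% of the core value and the mass estimate drops by roughly half, in line with the figures quoted above.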

Despite the potential for systematic uncertainties,
the luminosity-weighted temperatures of clusters correlate
well with their velocity dispersions. Most of the
recent comparisons for low-redshift clusters find that
$\sigma_{1D} \propto T_{\rm sp}^{0.6}$, slightly steeper than expected if both
quantities track cluster mass (Lubin and Bahcall, 1993;
Xue and Wu, 2000). Those same comparisons find
normalizations of this relation for rich clusters in the range
$\beta_{\rm sp} = \mu m_p \sigma_{1D}^2 / k_B T_{\rm sp} = 0.9 - 1.0$ (Figure 1). The
discrepancy between $\beta_{\rm sp}$ and $\beta_{\rm fit}$ is no cause for concern. It
arises because the true mass profile is not a King model
and because clusters are not in perfect hydrostatic
equilibrium (Bahcall and Lubin, 1994; Evrard, 1990). More
worrisome are recent observations suggesting that the
X-ray temperatures of distant optically-selected clusters
with unusually small X-ray luminosities are also considerably
cooler than their velocity dispersions would indicate
(Lubin et al., 2004). However, more extensive redshift
measurements have shown that at least one of these systems
is composed of several smaller systems that have not
yet merged to form a single large cluster (Gal and Lubin,
2004).
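
The definition of the spectral beta parameter can be evaluated directly. The Python sketch below assumes a power-law relation with normalization 10^2.51 km/s at 1 keV and slope 0.61, taken here as the combined-sample fit associated with Figure 1, and a mean molecular weight mu = 0.6. The slope, mu, and the function names are assumptions for illustration.

```python
M_PROTON = 1.6726e-27   # proton mass [kg]
KEV = 1.6022e-16        # 1 keV in joules


def sigma_1d_kms(kT_keV, log_norm=2.51, slope=0.61):
    """Velocity dispersion [km/s] from an assumed power-law sigma-T relation."""
    return 10.0 ** log_norm * kT_keV ** slope


def beta_spec(sigma_kms, kT_keV, mu=0.6):
    """beta_sp = mu m_p sigma^2 / (k_B T); mu = 0.6 is an assumed value."""
    sigma_si = sigma_kms * 1.0e3          # convert km/s to m/s
    return mu * M_PROTON * sigma_si ** 2 / (kT_keV * KEV)


kT = 6.0                                   # keV
sigma = sigma_1d_kms(kT)
print(f"sigma_1D = {sigma:.0f} km/s, beta_sp = {beta_spec(sigma, kT):.2f}")
```

With these inputs beta_sp lands inside the 0.9 to 1.0 range quoted in the text.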



3. Measuring Abundances 

Abundances of elements like iron, oxygen, and silicon 
in the intracluster medium are relatively easy to measure 
from their emission line fluxes, as long as the tempera- 
ture of the line-emitting gas is well defined. Because of 
the low density of intracluster gas, collisional deexcitation
is negligible, so every collisional excitation produces
a photon that leaves the cluster. Thus, one can fit the
optically-thin spectrum of a collisionally-ionized, single-
temperature plasma to the observed spectrum, adjusting 
the abundances in the model to produce the best fit. The 
high spectral resolution of today's X-ray observatories, 
Chandra and XMM-Newton, allows abundance determi- 
nations for individual elements if enough photons can be 
gathered. Otherwise, the solar pattern of abundance ra- 
tios is assumed for elements other than H and He and the 
normalization of the overall pattern is fit to the observations.
Because the most abundant elements are almost
completely ionized in the hottest clusters, these abundance
determinations depend heavily on the strength of
the K-shell emission lines of iron, sometimes the only
lines that are measurable.

FIG. 1 Relation between velocity dispersion and temperature for a heterogeneous sample drawn from the literature. Solid
squares illustrate data on galaxy groups and open circles give the cluster data. The dotted and solid lines show the
best power-law fits for groups and clusters, respectively. The best-fitting relation to the combined sample is
$\sigma_{1D} = 10^{2.51 \pm 0.01}$ km s$^{-1}$ $(T/1\,{\rm keV})^{0.61}$, corresponding to $\beta_{\rm sp} = 0.97$ at 6 keV. (Figure from Xue and Wu, 2000)

On average, the overall abundances of heavy elements
with respect to hydrogen in clusters are about 0.3 times
the solar ratios. Just as with temperature, this determination
is weighted toward the cluster core because of the
$\rho_g^2$ emissivity. Spatially resolved observations of Fe
line emission show that iron abundances, at least, can
be higher at the cluster's center, particularly when a
giant, central-dominant galaxy is there. This iron excess
is consistent with being supernova debris from the
giant galaxy's stars (De Grandi et al., 2003). Farther
out in clusters, these Fe gradients appear to flatten at
$\sim 0.3$ times the solar level, extending to about $5 r_c$,
beyond which point the X-ray surface brightness is too
low for accurate abundance and temperature measurements.
This abundance level does not seem to have substantially
changed from redshift $\sim 1$ to the present
(Donahue et al., 1998, 1999; Tozzi et al., 2003).

The total amount of iron implied by extrapolating this
ratio over an entire cluster is quite impressive, exceeding
the total amount of iron contained within all the stars in
the cluster's galaxies (Renzini, 1997). Explaining how all
that iron got into the intracluster medium is challenging.
It is comparable to the total amount of iron produced by
all the supernovae thought to have exploded during the
history of the cluster, and according to some estimates,
it requires a disproportionately large number of massive
stars to have formed in order to produce enough supernovae
(David et al., 1991; Gibson and Matteucci, 1997;
Loewenstein, 2001; Loewenstein and Mushotzky, 1996;
Matteucci and Gibson, 1995; Portinari et al., 2004).

Presumably all these supernovae could have driven
strong gaseous outflows known as galactic winds that
expelled the heavy elements into the intracluster medium
(Heckman et al.; Larson and Dinerstein, 1975).
However, such powerful galactic winds are hard to produce
in numerical simulations of galaxies because much
of the energy released by massive-star (Type II) supernovae
is transferred to cool gas within the galaxy, where
it is radiated away before it manages to drive a powerful
wind (Mac Low and Ferrara, 1999). Alternatively,
some of this iron may come from exploding white dwarfs
(Type Ia supernovae), whose iron yields are higher than
those of Type II supernovae. In either case, the total
amount of kinetic energy released by the supernovae
that created these elements is enormous, corresponding
to $\sim 0.3 - 1$ keV per particle in the intracluster medium
(Finoguenov et al., 2001; Pipino et al., 2002). Yet, the
efficiency of energy transfer from supernovae to the ICM
remains an open question (Kravtsov and Yepes, 2000).

In principle, one can probe the origins of elements in
the ICM and assess whether massive stars were
disproportionately common earlier in time by comparing the
abundances of massive-star products like oxygen to that
of iron, which may come largely from Type Ia supernovae.
No clear answer has yet emerged from such studies, which
depend heavily on a proper understanding of the gas
temperature distribution to get the correct elemental
abundances (e.g., Buote et al., 2003). Some studies have
concluded that the relative abundance patterns in the
intracluster medium are near solar, implying that the stellar
populations producing those supernovae were similar to
those in our own galaxy (Renzini, 2004). Other studies
find an excess of oxygen and other elements of similar
atomic number, suggesting that the cluster's galaxies
produced an unusually large number of massive stars
early in the cluster's history (e.g., Finoguenov et al.,
2003).



C. Clusters in Microwaves 

Hot gas in clusters can also be observed through its
effects on the cosmic microwave background. The background
itself has a virtually perfect blackbody spectrum
(Mather et al., 1990). Soon after the discovery of
this background radiation, Weymann (1965, 1966) computed
how Compton scattering would distort its spectrum,
slightly shifting some of the microwave photons to
higher energies as they passed through hot intergalactic
gas. Sunyaev and Zeldovich (1970, 1972) then predicted
that hot gas in clusters of galaxies would indeed produce
such a distortion, now known as the Sunyaev-Zeldovich
(S-Z) effect.



1. The S-Z Effect 

Two decades after this prediction there were only a
few marginal detections (Birkinshaw, 1991), but many
clusters were detected at high significance in the ensuing
decade (Birkinshaw, 1999; Carlstrom et al., 2000). With
multiple new and highly capable S-Z instruments coming
on line in the next few years, another quantum leap in
this area is poised to happen, enabling wide-field cosmological
studies of clusters to extend through much of the
observable universe (Carlstrom et al., 2002). A number
of recent reviews elucidate the details of the S-Z effect
(e.g., Birkinshaw, 1999; Carlstrom et al., 2002). Here we
summarize only a few fundamentals.

To lowest order, the shape of the distorted spectrum
depends on a single parameter proportional to the product
of the probability that a photon passing through the
cluster will Compton scatter and the typical amount of
energy a scattered photon gains:

$$ y = \int \frac{k_B T}{m_e c^2} \, n_e \sigma_T \, dl , \qquad (7) $$

where $\sigma_T$ is the Thomson cross-section and the integral
is over a line of sight through the cluster. Because the
optical depth of the cluster is small, the change in microwave
intensity at any frequency is linearly proportional
to $y \ll 1$, with reduced intensity at long wavelengths
and enhanced intensity at short wavelengths.
Relativistic corrections in hot clusters add a slight frequency
dependence to the magnitude of the effect, making
cluster temperatures measurable with precise observations
of the microwave distortion at several frequencies
(see Carlstrom et al., 2002, for a discussion). A cluster's
motion with respect to the microwave background produces
additional distortion, known as the kinetic S-Z effect,
but here we will concern ourselves only with the
thermal S-Z effect.
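
To give a feel for the magnitude of the thermal distortion parameter, the Python sketch below integrates the line-of-sight definition of y through an isothermal beta-model cluster with a simple midpoint rule. The cluster parameters (central electron density, core radius, temperature), the integration limits, and the function name are illustrative assumptions only.

```python
import math

SIGMA_T_CM2 = 6.6524e-25   # Thomson cross-section [cm^2]
ME_C2_KEV = 511.0          # electron rest energy [keV]
KPC_CM = 3.0857e21         # 1 kpc in centimetres


def compton_y(kT_keV, ne0_cm3, r_core_kpc, beta=2.0 / 3.0,
              impact_kpc=0.0, half_length_kpc=2.0e4, n_steps=20000):
    """Thermal S-Z parameter y along one sight line (midpoint rule).

    Assumes an isothermal gas whose electron density follows a beta model,
    n_e(r) = ne0 * (1 + r^2 / r_c^2)^(-3 beta / 2).
    """
    dl = 2.0 * half_length_kpc / n_steps
    column = 0.0                                   # integral of n_e dl [cm^-3 kpc]
    for i in range(n_steps):
        los = -half_length_kpc + (i + 0.5) * dl    # position along the sight line
        x2 = (impact_kpc ** 2 + los ** 2) / r_core_kpc ** 2
        column += ne0_cm3 * (1.0 + x2) ** (-1.5 * beta) * dl
    column *= KPC_CM                               # convert to cm^-2
    return (kT_keV / ME_C2_KEV) * SIGMA_T_CM2 * column


# Assumed cluster: kT = 6 keV, n_e0 = 5e-3 cm^-3, r_c = 200 kpc, central sight line.
y0 = compton_y(6.0, 5.0e-3, 200.0)
print(f"central y = {y0:.2e}")
```

For beta = 2/3 the central column density has the closed form pi * n_e0 * r_c for an infinitely long sight line, which provides a convenient check on the numerical integral. The resulting y is of order 10^-4, consistent with the statement that y is much smaller than unity.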

Cosmological applications of the thermal S-Z effect in
clusters benefit greatly from the fact that the effect is
independent of distance, unlike optical and X-ray surface
brightness. Thus, a dedicated S-Z cluster survey
efficiently finds clusters out to arbitrarily high redshifts.
Because not all these clusters will be well resolved, the
surveys will be measuring an integrated version of the
distortion parameter:

$$ Y = \int y \, dA \propto \int n_e T \, dV , \qquad (8) $$

where the first integral is over a cluster's projected surface
area and the second is over its volume. The $Y$ parameter
therefore tells us the total thermal energy of the
electrons, from which one easily derives the total gas mass
times its mass-weighted temperature within a given region
of space. If these regions can be chosen so that
the gas mass is always proportional to the cluster's total
mass, then the observable $Y$ can be used as a measure of
cluster mass, once the relationship between $Y$ and mass
has been calibrated.

The impressive power of the S-Z effect for finding distant
clusters also has a significant drawback, namely sky
confusion owing to projection effects. Along any line of
sight through the entire observable universe, the probability
of passing within the virial radius of a cluster or
group of galaxies is of order unity (e.g., Voit et al., 2001).
Because a cluster's S-Z distortion does not diminish with
distance, many of the objects in a highly sensitive S-Z
survey will therefore significantly overlap. Information
on galaxy colors will help to separate nearby objects from
more distant ones, but the implications of sky confusion
for making accurate mass measurements are still a matter
to be reckoned with (e.g., White et al., 2002). One
way to avoid the problem of sky confusion will be to
measure the statistical S-Z properties of clusters in the
angular power spectrum of the microwave sky instead of
analyzing the clusters themselves (da Silva et al., 2001;
Holder and Carlstrom, 2001; Seljak et al., 2001). In fact,
this statistical signal may already have been detected
(Kuo et al., 2004; Pearson et al., 2003).



2. Comparing S-Z with X-ray 

Comparisons between a cluster's X-ray properties and
S-Z properties are useful in several different ways.
X-ray observations are nicely complementary to S-Z
observations of clusters because they give the integral of
$\rho_g^2$ along lines of sight through a cluster in addition to a
gas temperature. Assuming that clusters are spherical
objects with smooth gas distributions, one can divide
the product of temperature and the line-of-sight integral
of $\rho_g^2$ by the observed $y$ value to obtain a cluster's gas
density profile. Combining the data in this way can be
particularly useful in studying the outskirts of clusters,
where the X-ray surface brightness is difficult to observe
but the S-Z signal remains substantial. With this density
profile in hand, one can then derive the line-of-sight
thickness of the cluster from either the X-ray or S-Z
observations. This type of information could help to solve
the S-Z projection problem in fields where there are
high-quality X-ray and S-Z data.

If a cluster is indeed spherical, then a comparison of
its physical thickness with its apparent angular
size directly gives the cluster's distance, which can be
used to determine the scale and geometry of the universe
(Birkinshaw et al., 1991). Deriving the scale of
the universe in this way is subject to numerous systematic
effects. For example, clusters are not all perfectly
spherical. Many appear slightly ellipsoidal in X-ray images,
calling for a sample of clusters with random orientations
to beat down this systematic effect, although
three-dimensional reconstructions are possible with the
addition of gravitational-lensing data (e.g., Zaroubi et al.,
2001). Note also that comparisons of X-ray images to S-Z
images would produce nonsensical distances if the
intracluster medium were highly clumpy, owing to the $\rho_g^2$
dependence of the X-ray emissivity. The fact that cluster
distances found in this way are consistent with the standard
calibrations of Hubble's Law indicates that the X-ray emitting gas is
well-behaved and that most clusters are in approximate
hydrostatic equilibrium.



III. EVOLUTION OF THE DARK COMPONENT 

Cluster masses measured with the techniques outlined
in the previous section range from around $10^{14}$ to
more than $10^{15}\,M_\odot$, the vast majority of which appears
to be dark matter that emits no detectable radiation.
Even using alternative theories of gravity, it is difficult
to explain the cluster observations without dark matter
dominating the overall mass (Sanders, 2003). In contrast,
explaining the characteristics of clusters and their evolution
with redshift is much easier with models in which
non-baryonic cold dark matter dominates the mass density
of the universe.

This section explains how the evolution of the dark 
component of the universe, including both dark matter 
and dark energy, is thought to be reflected in the evolu- 
tion of cluster properties. It begins with a summary of 
the concordance model for cosmology and some closely 
related alternatives, all of which are predicated on the 
existence of non-baryonic cold dark matter. It then ex- 
plains how dark matter drives cluster formation in such 
models, providing some simple analytical approximations 
to the extensive numerical work that has been done on 
the subject. These models do a good job of accounting 
for the basic properties of observed clusters, allowing as- 
tronomers to measure several of the parameters in the 
concordance model using cluster observations, most no- 
tably the overall mass density of the universe and the am- 
plitude of the initial spectrum of density perturbations 
that eventually produces all the structure we observe. 

The accuracy of those parameter measurements is cur- 
rently limited by uncertainties in the relationships be- 
tween cluster masses and the observable properties that 
trace those masses. Numerical simulations of cluster for- 
mation do not yet provide precise calibrations of these 
relations because they do not yet account for all of 
the thermodynamical processes associated with galaxy 
formation. The third part of this section surveys the 
mass-observable relations and how the uncertainties in 
those relations affect cosmological parameters derived 
from them. The fourth part of this section examines 
how the properties of clusters evolve and how fitting 
that evolution with cosmological models improves the 
accuracy of the derived cosmological parameters. Even 
though current surveys of distant clusters contain rela- 
tively few objects, they already place strong constraints 
on the overall matter density. Larger cluster surveys in 
both the microwave and X-ray bands have the potential 
to place much stronger constraints on the overall cos- 
mological model, measuring both dark matter and dark 
energy parameters to 5% statistical accuracy, indepen- 
dently of other cosmological observations. 



A. A Recipe for the Universe 

Our current understanding of cluster evolution is an 
outgrowth of the overall cosmological model, whose primary
features depend on just a handful of parameters.
One set of parameters specifies the global cosmological 
model, which describes the overall geometry of the uni- 
verse, the mean density of its contents, and how its scale 
changes with time. The other important set of param- 
eters specifies the initial spectrum of density perturba- 
tions that grew into the galaxies and clusters of galaxies 
we see today. Here we define both sets of parameters 
and their roles in the context of the overall model. More 
extensive and detailed discussions of this recipe for the 
universe can be found in some of the excellent books on
cosmology (e.g., Peacock, 1999; Peebles, 1993).



1. Global Dynamics 

The expansion of the universe can be characterized by
a time-dependent scale factor $a(t)$ proportional to the
mean distance between the universe's galaxies. Hubble's
Law relating the distance $d$ between two galaxies and the
speed $v$ at which they appear to move apart can then be
written as $v = H(t)\,d$, where $H(t) = \dot{a}/a$ is the Hubble
parameter. Many independent measurements indicate
that the value of this parameter at the current time $t_0$
is $H(t_0) = H_0 = 71 \pm 7$ km s$^{-1}$ Mpc$^{-1}$ (Freedman et al.,
2001). The value of $H_0$, known as Hubble's constant, is
often further distilled in the literature into the dimensionless
quantity $h = H_0/(100$ km s$^{-1}$ Mpc$^{-1})$. Sometimes
this review will use the more suitable alternative
$h_{70} = H_0/(70$ km s$^{-1}$ Mpc$^{-1})$ when characterizing
observable cluster properties.

On very large scales, the universe appears homogeneous
and isotropic. Astronomers therefore assume that the
time-dependent behavior of $H(t)$ obeys the Friedmann-
Lemaitre model of the universe, in which

$$ \frac{\ddot{a}}{a} = -\frac{4 \pi G}{3} \left( \rho + \frac{3p}{c^2} \right) , \qquad (9) $$

where $\rho(t) c^2$ is the mean density of mass-energy and $p(t)$
is the pressure owing to that energy density. Local energy
conservation requires that

$$ \dot{\rho} c^2 = -3 \frac{\dot{a}}{a} \left( \rho c^2 + p \right) , \qquad (10) $$

and we can use this expression to integrate the dynamical
equation as long as we know the equation of state
linking $p$ and $\rho$. If the equation of state has the form
$p = w \rho c^2$, then density changes with the expansion as
$\rho \propto a^{-3(1+w)}$. For a single mass-energy component with
a constant value of $w$ we therefore obtain

$$ \dot{a}^2 = \frac{8 \pi G}{3} \rho_0 a^{-(1+3w)} + {\rm const.} , \qquad (11) $$

where $\rho_0$ is the value of the energy density when $a = 1$
and the constant of integration is related to the global
curvature of the universe.



It is most convenient to normalize the scale factor so
that it equals unity at the current time. Then the cosmological
redshift $z$ of radiation from distant objects is
simply related to the scale factor of the universe when
that radiation was emitted: $a = (1 + z)^{-1}$. This definition
allows us to link the constant of integration to more
familiar parameters, obtaining

$$ \left( \frac{\dot{a}}{a} \right)^2 = H_0^2 \left[ \Omega_0 (1+z)^{3(1+w)} + (1 - \Omega_0)(1+z)^2 \right] , \qquad (12) $$

where $\Omega_0$ is the current energy density $\rho_0$ in units of the
current critical density $\rho_{\rm cr0} = 3 H_0^2 / 8 \pi G$.

Several different components of the universe, each with
a different equation of state, can influence the overall
expansion history. Non-relativistic particles with a mass
density $\rho_M$ contribute negligible pressure, corresponding
to $w = 0$. The energy density $\rho_R c^2$ in photons and other
relativistic particles exerts a pressure with $w = 1/3$.
Einstein's cosmological constant acts like an energy density
$\rho_\Lambda c^2$ that remains constant while the universe expands
and therefore exerts a pressure corresponding to $w = -1$.
Including each of these components yields the dynamical
equation

$$ H^2(z) = \left( \frac{\dot{a}}{a} \right)^2 = H_0^2 \left[ \Omega_M (1+z)^3 + \Omega_R (1+z)^4 + \Omega_\Lambda + (1 - \Omega_0)(1+z)^2 \right] , \qquad (13) $$

where $\Omega_x$ is the current mass-energy density in component
$x$ in units of $\rho_{\rm cr0}$ and $\Omega_0 = \Omega_M + \Omega_R + \Omega_\Lambda$.
The value of $\Omega_x$ at an arbitrary redshift is given by
$\Omega_x(z) = \Omega_x (1+z)^{3(1+w_x)} [H(z)/H_0]^{-2}$.
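
The dynamical equation and the redshift dependence of each density parameter translate directly into code. The Python sketch below is a minimal illustration; the function names and the default parameter values (0.3 for matter, 0.7 for the constant term, no radiation) are assumptions chosen for the example. A constant equation-of-state parameter w other than -1 for the last component is allowed as a generalization, and w = -1 recovers the cosmological-constant case.

```python
def expansion_rate_sq(z, omega_m=0.3, omega_r=0.0, omega_de=0.7, w=-1.0):
    """[H(z) / H_0]^2 for matter, radiation, and a constant-w energy term."""
    omega_0 = omega_m + omega_r + omega_de
    zp1 = 1.0 + z
    return (omega_m * zp1 ** 3
            + omega_r * zp1 ** 4
            + omega_de * zp1 ** (3.0 * (1.0 + w))
            + (1.0 - omega_0) * zp1 ** 2)           # curvature term


def density_parameter(z, omega_x, w_x, **cosmology):
    """Omega_x(z) for a component with present-day value omega_x and
    equation-of-state parameter w_x (0 for matter, 1/3 for radiation)."""
    return (omega_x * (1.0 + z) ** (3.0 * (1.0 + w_x))
            / expansion_rate_sq(z, **cosmology))


for z in (0.0, 1.0, 10.0):
    om = density_parameter(z, 0.3, 0.0)
    ol = density_parameter(z, 0.7, -1.0)
    print(f"z = {z:4.1f}: Omega_M(z) = {om:.3f}, Omega_Lambda(z) = {ol:.3f}")
```

In this flat example the two density parameters always sum to one, and the matter term approaches unity at high redshift while the constant term fades.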

Each of these energy-density parameters can be further
articulated. The matter density parameter $\Omega_M$ consists
of a contribution $\Omega_b$ from baryons and a contribution
$\Omega_{\rm CDM}$ from non-baryonic cold dark matter. The radiation
density parameter includes contributions from the
photons of the microwave background, $\Omega_{\rm CMB}$, and from
relict neutrinos produced in the Big Bang, $\Omega_\nu$, as long
as they remain relativistic particles. Finally, because the
physical origin of the $\Omega_\Lambda$ term remains mysterious, it may
not be correct to assume that the energy density responsible
for it stays constant with time. In order to check this
possibility observationally, one can replace the $\Omega_\Lambda$ term
with a generalized dark-energy term $\Omega_\Lambda (1+z)^{3(1+w)}$ and
attempt to measure the value of $w$ (Turner and White,
1997; Wang and Steinhardt, 1998).

Recent observations, including the cluster studies we
will discuss later, have provided approximate values for
many of these energy-density parameters, allowing us to
estimate when each of the various energy components
dominated the dynamics (Figure 2). Dark energy with
$\Omega_\Lambda \approx 0.7$ seems to be most important at the current
epoch, and because of the scaling of other terms with
redshift, it will grow increasingly dominant as time progresses.
Non-relativistic matter appears to have a density
corresponding to $\Omega_M \approx 0.3$, implying that matter dominated
the dynamics at $z \gtrsim 1$. The radiation term was
most important in the distant past, prior to the redshift
$z_{\rm eq} = \Omega_M / \Omega_R - 1$ of matter-radiation equality. Neutrinos
with masses less than a few eV will be relativistic
particles at this epoch, leading to

$$ 1 + z_{\rm eq} = 2.37 \times 10^4 \, \Omega_M h^2 \qquad (14) $$

for $T_{\rm CMB} = 2.73$ K at $z = 0$ and three families of neutrinos.

FIG. 2 Evolution of energy densities with redshift. The various hues show the dependence of $\Omega_M(z)$, $\Omega_\Lambda(z)$, and $\Omega_R(z)$ on
redshift for various sets of present-day cosmological parameters. Structure in the universe grows most rapidly while $\Omega_M(z) \approx 1$,
because positive density perturbations then exceed the critical density. This period of time occurs between the redshift $z_{\rm eq}$ when
$\Omega_M(z_{\rm eq}) \approx \Omega_R(z_{\rm eq})$ and the redshift at which $\Omega_M$ begins to drop. Notice that the redshift $z_{\rm eq}$ is earlier for larger present-day
values of $\Omega_M$ and that the redshift at which $\Omega_M(z)$ begins to decline depends on the characteristics of dark energy. Observations
of clusters and their evolution provide opportunities to constrain the values of $\Omega_M$, $\Omega_\Lambda$, and $w$ because the timing of both of
these epochs influences the properties of the cluster population.



2. Global Geometry 

Geometry in a universe that is homogeneous and
isotropic has the same radius of curvature everywhere,
but its overall architecture can be either positively
curved, flat, or negatively curved, depending on the value
of $\Omega_0$. Because the scale of the universe is changing
with time, the most sensible coordinate system to use
when describing its geometry is one that expands along
with the universe. In such a comoving coordinate system,
a radial interval in spherical coordinates has length
$a(t)\,dr$, and the interval corresponding to a small transverse
angle $d\psi = \sqrt{d\theta^2 + \sin^2\theta \, d\phi^2}$ depends on the radius
of curvature $a(t) R_\kappa$. For positive curvature, analogous
to the surface of a sphere, the transverse interval
is $a(t) R_\kappa \sin(r/R_\kappa) \, d\psi$, and for negative curvature it is
$a(t) R_\kappa \sinh(r/R_\kappa) \, d\psi$.

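We can therefore write the Robertson-Walker metric
that describes such a universe as

$$ c^2 d\tau^2 = c^2 dt^2 - a^2(t) \left[ dr^2 + R_\kappa^2 S_\kappa^2(r/R_\kappa) \, d\psi^2 \right] , \qquad (15) $$

where $S_\kappa(x) = \sin x$ for positive curvature ($\kappa = 1$),
$S_\kappa(x) = \sinh x$ for negative curvature ($\kappa = -1$), and
a flat universe ($\kappa = 0$) corresponds to $R_\kappa \to \infty$. The
metric can be written in the more familiar form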



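$$ c^2 d\tau^2 = c^2 dt^2 - a^2(t) \left[ \frac{dr_\kappa^2}{1 - \kappa r_\kappa^2 / R_\kappa^2} + r_\kappa^2 \, d\psi^2 \right] \qquad (16) $$

with the definition $r_\kappa = R_\kappa S_\kappa(r/R_\kappa)$. Plugging this metric
into Einstein's field equations leads to

$$ \left( \frac{\dot{a}}{a} \right)^2 = \frac{8 \pi G}{3} \rho - \frac{\kappa c^2}{a^2 R_\kappa^2} , \qquad (17) $$

which relates the radius of curvature to other cosmological
parameters:

$$ R_\kappa = \frac{c}{H_0} \sqrt{\frac{\kappa}{\Omega_0 - 1}} . \qquad (18) $$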

Notice that the universe at early times is effectively flat
as long as $\Omega_M + \Omega_R > 0$, because the horizon size of
the observable patch is $\sim c/H(z) \ll (1+z)^{-1} R_\kappa$ for
observers at times corresponding to large values of the
redshift $z$.

The low-redshift universe may also be effectively flat,
but that is not guaranteed. Consequently, both the expansion
of the universe and its curvature need to be taken
into account when we observe highly redshifted objects
like distant clusters of galaxies. Because the metric relates
the comoving radial coordinate $r$ to redshift through
$dr/dz = c/H(z)$, the coordinate distance to an object
with an observed redshift $z$ is

$$ r(z) = c \int_0^z \frac{dz'}{H(z')} . \qquad (19) $$

Relations involving the divergence of light paths can then
be compactly written in terms of $r_\kappa(z) = R_\kappa S_\kappa[r(z)/R_\kappa]$,
which reduces to $r(z)$ in a flat universe. For example,
the angle subtended at coordinate distance $r(z)$ by the
transverse length $l$ becomes

$$ \psi = \frac{(1+z) \, l}{r_\kappa(z)} . \qquad (20) $$


In a flat, static universe, an object of physical size $l$ would
subtend this same angle if it were at the distance $d_A(z) =
r_\kappa(z)/(1+z)$, sometimes called the angular-size distance.
Likewise, the comoving volume within a solid angle $d\Omega$
and a redshift interval $dz$ is given by

$$ \frac{dV_{\rm co}}{d\Omega \, dz} = \frac{c \, r_\kappa^2(z)}{H(z)} . \qquad (21) $$



These formulae are useful to cluster cosmology because
they allow us to constrain $H(z)$ and the cosmological
parameters that go into it if we know either the transverse
sizes of high-redshift clusters or their number density
within a given comoving volume. Figure 3 shows how
the comoving volume of the universe depends on redshift
for several different sets of cosmological parameters.
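
The distance relations above generally require a numerical integral. The Python sketch below evaluates the coordinate distance with Simpson's rule and then applies the curvature function to obtain the angular-size distance and the comoving volume element. It neglects radiation, assumes a cosmological constant, and uses illustrative default parameters (H_0 = 70 km/s/Mpc, matter 0.3, constant term 0.7); all function names are hypothetical.

```python
import math

C_KMS = 299792.458   # speed of light [km/s]


def hubble(z, h0, omega_m, omega_de):
    """H(z) in km/s/Mpc, neglecting radiation (an assumption of this sketch)."""
    zp1 = 1.0 + z
    omega_0 = omega_m + omega_de
    return h0 * math.sqrt(omega_m * zp1 ** 3 + omega_de
                          + (1.0 - omega_0) * zp1 ** 2)


def coordinate_distance(z, h0=70.0, omega_m=0.3, omega_de=0.7, n=1000):
    """r(z) = c * integral of dz / H(z), via composite Simpson (n even) [Mpc]."""
    step = z / n
    total = 1.0 / hubble(0.0, h0, omega_m, omega_de)
    total += 1.0 / hubble(z, h0, omega_m, omega_de)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * step, h0, omega_m, omega_de)
    return C_KMS * total * step / 3.0


def transverse_distance(z, h0=70.0, omega_m=0.3, omega_de=0.7):
    """r_kappa(z) = R_kappa * S_kappa(r / R_kappa) [Mpc]."""
    r = coordinate_distance(z, h0, omega_m, omega_de)
    curvature = omega_m + omega_de - 1.0
    if abs(curvature) < 1e-12:
        return r                                    # flat universe
    radius = C_KMS / h0 / math.sqrt(abs(curvature))
    if curvature > 0.0:
        return radius * math.sin(r / radius)        # positive curvature
    return radius * math.sinh(r / radius)           # negative curvature


def angular_size_distance(z, **cosmo):
    return transverse_distance(z, **cosmo) / (1.0 + z)


def comoving_volume_element(z, h0=70.0, omega_m=0.3, omega_de=0.7):
    """dV_co / (dOmega dz) = c r_kappa^2 / H(z) [Mpc^3 per steradian]."""
    r_k = transverse_distance(z, h0, omega_m, omega_de)
    return C_KMS * r_k ** 2 / hubble(z, h0, omega_m, omega_de)


print(f"d_A(z=1) = {angular_size_distance(1.0):.0f} Mpc")
```

A simple check is the flat, matter-only case, where the integral has the closed form r(z) = (2c/H_0)[1 - (1+z)^(-1/2)], so that r(3) equals c/H_0.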

FIG. 3 Redshift dependence of comoving volume in various
cosmologies. The quantity $dV_{\rm co}/dz$ is the comoving volume of
the entire sky between redshift $z$ and $z + dz$, divided by the
redshift interval $dz$. If clusters were a non-evolving population
of objects, one could distinguish between these cosmologies
simply by counting the number of clusters on the sky in each
redshift interval. Curves are labeled by $(\Omega_M, \Omega_\Lambda, w) = (0.3, 0.7, -1)$
and $(0.3, 0.7, -0.8)$, and by $(\Omega_M, \Omega_\Lambda) = (0.3, 0.0)$.

When surveying the universe for clusters, we also need
to know how the geometry and expansion of the universe
affect the apparent brightness of a cluster and the galaxies
within it. The expansion alone reduces the energy flux
received from a distant object by two factors of $1+z$, with
one factor coming from the time dilation of the photon
flux owing to expansion and the other from the redshift of
the photons themselves. The observed energy flux from
an object of luminosity $L$ is therefore

$$ F = \frac{L}{4 \pi (1+z)^2 r_\kappa^2(z)} . \qquad (22) $$

We would measure the same flux from an equivalent object
in a flat, static universe if it were at a distance
$d_L(z) = (1+z) r_\kappa(z)$, sometimes called the luminosity
distance. The consequences for surface brightness, equal
to flux per unit solid angle, are even more dramatic. In
a flat, static universe, an object's surface brightness remains
constant, but its surface brightness in an expanding
universe is reduced by a factor $d_A^2/d_L^2 \propto (1+z)^{-4}$,
meaning that extended objects like clusters are far less
bright at high redshifts.
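
The flux and surface-brightness relations can be sketched in a few lines of Python. The function names are hypothetical, and the transverse coordinate distance is passed in as an input rather than computed, so the example does not depend on any particular cosmology.

```python
import math

MPC_CM = 3.0857e24   # 1 Mpc in centimetres


def observed_flux(luminosity_erg_s, z, r_kappa_mpc):
    """Energy flux [erg s^-1 cm^-2] from a source of given luminosity."""
    d_lum_cm = (1.0 + z) * r_kappa_mpc * MPC_CM     # luminosity distance
    return luminosity_erg_s / (4.0 * math.pi * d_lum_cm ** 2)


def surface_brightness_dimming(z):
    """Factor (d_A / d_L)^2 = (1 + z)^-4 by which surface brightness drops."""
    return (1.0 + z) ** -4


for z in (0.1, 0.5, 1.3):
    print(f"z = {z}: dimming factor = {surface_brightness_dimming(z):.3f}")
```

At z = 1.3 the dimming factor is roughly 1/28, which illustrates why extended sources become hard to detect at such redshifts.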

It would be nice if ri^{z) could be expressed analyti- 
cally for a general cosmology, but in most cases it can- 
not. However, a useful analytical expression for the di- 
vergence factor does exist for the case in which fl^ and 
Ha are neghgible (Mattig 1958): 



2c Qm-s + (f^M - 2) (VI + ^MZ - 1) 



ni,{i + z) 



(23) 



Usually one needs to integrate equation (19) numerically
and then insert the results into the $S_\kappa$ function to obtain
the rest of the relations.
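As a rough illustration of equations (22) and (23), the following Python sketch evaluates the Mattig expression and the resulting luminosity distance, flux, and surface-brightness dimming. The function names, the default value of 71 km/s/Mpc for Hubble's constant, and the choice of Mpc as the length unit are assumptions made for this example; the closed form applies only when radiation and dark energy are negligible.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def mattig_r(z, omega_m, h0=71.0):
    """Distance r_kappa(z) in Mpc from the Mattig formula, eq. (23).

    Assumes negligible radiation and dark energy; h0 is in km/s/Mpc.
    """
    root = math.sqrt(1.0 + omega_m * z)
    numerator = omega_m * z + (omega_m - 2.0) * (root - 1.0)
    return (2.0 * C_KM_S / h0) * numerator / (omega_m ** 2 * (1.0 + z))


def luminosity_distance(z, omega_m, h0=71.0):
    """d_L = (1 + z) r_kappa(z), in Mpc."""
    return (1.0 + z) * mattig_r(z, omega_m, h0)


def flux(luminosity, z, omega_m, h0=71.0):
    """Observed energy flux of eq. (22), in units of luminosity per Mpc^2."""
    return luminosity / (4.0 * math.pi * luminosity_distance(z, omega_m, h0) ** 2)


def surface_brightness_dimming(z):
    """Factor (1 + z)^-4 by which expansion reduces surface brightness."""
    return (1.0 + z) ** -4


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0):
        print(z, luminosity_distance(z, 0.3), surface_brightness_dimming(z))
```

For $\Omega_M = 1$ the expression reduces to $r_\kappa = (2c/H_0)[1 - (1+z)^{-1/2}]$, which provides a convenient check of the implementation.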



3. Density Perturbations 

The very existence of galaxy clusters and the human 
beings who observe them demonstrates that the universe 
is not perfectly homogeneous. Therefore, the matter den- 
sity in the early universe must have been slightly lumpy. 
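At some early time these perturbations away from the
mean density $\langle\rho_M\rangle$ correspond to an overdensity field

$$\delta(\mathbf{x}) = \frac{\rho_M(\mathbf{x}) - \langle\rho_M\rangle}{\langle\rho_M\rangle} \qquad (24)$$

with Fourier components

$$\delta_{\mathbf{k}}(\mathbf{k}) = \int \delta(\mathbf{x})\, e^{i\mathbf{k}\cdot\mathbf{x}}\, d^3x . \qquad (25)$$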
In the plausible case that $\delta(\mathbf{x})$ is isotropic, it can be
characterized by an isotropic power spectrum

$$P(k) \equiv \langle |\delta_k|^2 \rangle . \qquad (26)$$



If $\delta(\mathbf{x})$ is also a Gaussian random field, then $P(k)$ is a
complete statistical description of the initial perturbation 
spectrum. 

The physical meaning of $P(k)$ becomes clearer if we
assume it has a power-law form, with $P(k) \propto k^n$, and
consider the variance in mass within identical volume
elements corresponding to the length scale $k^{-1}$. For
example, let $W(r)$ be a spherical window function that goes
quickly to zero outside some characteristic radius $r_W$ and
whose integral over all of space is unity. The mass
perturbation smoothed over the window is

$$\frac{\delta M}{M}(\mathbf{r}) = \int \delta(\mathbf{x})\, W(|\mathbf{x} - \mathbf{r}|)\, d^3x . \qquad (27)$$



Using the convolution theorem, we can then write down
the variance $\sigma^2 \equiv \langle |\delta M/M|^2 \rangle$ on this mass scale in terms
of $W_k$, the Fourier transform of $W(r)$:

$$\sigma^2 = \frac{1}{(2\pi)^3} \int P(k)\, |W_k|^2\, d^3k . \qquad (28)$$



The variance in mass on scale $k$ for a power-law
perturbation spectrum is therefore $\propto k^{n+3}$, because the
windowing averages out modes with $k \gg r_W^{-1}$. Thus, the
typical mass fluctuation on mass scale $M \propto k^{-3}$ is

$$\frac{\delta M}{M} \propto M^{-(n+3)/6} . \qquad (29)$$



Notice that large-scale homogeneity of the universe
requires $n > -3$.
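The scaling in equation (29) can be checked numerically. The Python sketch below integrates equation (28) for a pure power-law spectrum, taking a Gaussian window with $|W_k|^2 = \exp(-k^2 r_W^2)$ as an assumed example of a window that cuts off modes with $k \gg r_W^{-1}$; the function names, integration limits, and step count are illustrative choices, not part of the original analysis.

```python
import math


def sigma2_gaussian_window(r_w, n, k_min=1e-8, k_max=1e3, steps=20000):
    """Variance from eq. (28) for P(k) = k**n and |W_k|^2 = exp(-(k r_w)^2).

    Uses the trapezoid rule in ln k, with d^3k = 4 pi k^3 d(ln k).
    """
    du = math.log(k_max / k_min) / steps
    total = 0.0
    for i in range(steps + 1):
        k = k_min * math.exp(i * du)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * k ** (n + 3) * math.exp(-(k * r_w) ** 2)
    return 4.0 * math.pi * total * du / (2.0 * math.pi) ** 3


def mass_slope(n, r1=1.0, r2=8.0):
    """Measured d ln(sigma) / d ln(M), taking M proportional to r_w**3."""
    s1 = math.sqrt(sigma2_gaussian_window(r1, n))
    s2 = math.sqrt(sigma2_gaussian_window(r2, n))
    return math.log(s2 / s1) / (3.0 * math.log(r2 / r1))


if __name__ == "__main__":
    for n in (-2.0, 0.0, 1.0):
        print(n, mass_slope(n), -(n + 3.0) / 6.0)
```

The measured slope should agree with $-(n+3)/6$ for any $n > -3$; as $n$ approaches $-3$ the low-$k$ end of the integral stops converging, which is the numerical counterpart of the homogeneity requirement.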

It is also illuminating to consider how $P(k)$ relates to
fluctuations in the gravitational potential, $\delta\Phi \propto k\,\delta M$.
The potential fluctuations owing to a power-law
perturbation spectrum scale as $\delta\Phi \propto k^{(n-1)/2}$. The
magnitude of these fluctuations therefore diverges on either the
high-mass end or the low-mass end, except in the case of
$n = 1$. This special property of the $P(k) \propto k$ power
spectrum was noted independently by Harrison (1970),
Peebles and Yu (1970), and Zeldovich (1972). Not only is
this the most natural power-law spectrum, it also appears
to be a good approximation to the true power spectrum
of density fluctuations in the early universe. Inflationary
models for the seeding of structure in the universe
produce a Gaussian density field with a power-law index
close to $n = 1$ (Guth and Pi, 1982), which is consistent
with the observed fluctuations in the cosmic microwave
background (e.g., Spergel et al., 2003).



4. Growth of Linear Perturbations 

Once the universe has been seeded with density
perturbations they begin to grow because the gravity of slightly
overdense regions attracts matter away from neighboring,
slightly underdense regions. A complete treatment
of perturbation growth is beyond the scope of this
review, but some key features can be clarified with a simple
toy model consisting of a uniform-density sphere that is
slightly denser than its surroundings. The equation of
motion for the radius $R$ of an expanding homogeneous
sphere is analogous to the one governing the universe as
a whole. Integrating that equation with $a = R/R_0$, where
$R_0$ is an arbitrary fiducial radius at which $\rho = \rho_0$, gives

$$\frac{\dot{R}^2}{2} - \frac{4\pi G \rho_0 R_0^2}{3} \left(\frac{R}{R_0}\right)^{-(1+3w)} = \epsilon . \qquad (30)$$



The constant of integration $\epsilon$ in this equation is again
related to spatial curvature but can also be interpreted 
as the net specific energy of the sphere. 

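Now consider the behavior of two nearly identical
spheres that both begin expanding from $R = 0$ at $t = 0$
but have specific energies that differ by a small amount
$\delta\epsilon \ll \dot{R}^2/2$. As these two spheres evolve, their radii will
become slightly different by an amount $R_2 - R_1 = \delta R$,
which satisfies the equation

$$\int_0^{R_1} \frac{dR}{\dot{R}_1} = \int_0^{R_2} \frac{dR}{\dot{R}_2} . \qquad (31)$$

In the linear regime, we can make the substitution
$\dot{R}_2^{-1} = (1 - \dot{R}_1^{-2}\,\delta\epsilon)\,\dot{R}_1^{-1}$. If we then take the sphere of radius $R_1$
to be representative of the universe at large, we obtain

$$\frac{\delta R}{R} = \frac{\delta\epsilon}{R_0^2}\, \frac{\dot{a}}{a} \int_0^a \frac{da}{\dot{a}^3} . \qquad (32)$$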



Because $\delta\rho/\rho = -3(1+w)\,\delta R/R$, this model leads to the
following growth function for linear perturbations:

$$D(a) \propto \frac{\delta\rho}{\rho} \propto \frac{\dot{a}}{a} \int_0^a \frac{da}{\dot{a}^3} , \qquad (33)$$



which is conventionally normalized so that $D(a) = 1$ at
$z = 0$. Notice that the rate of perturbation growth
implied by $D(a)$ does not depend on the scale of the
perturbation, implying that density perturbations on all scales
grow in unison.

This expression for the growth function is identical to
those obtained through more rigorous arguments (e.g.,
Heath, 1977; Peebles, 1993). In a matter-dominated
universe, perturbation amplitudes grow in proportion to the
scale factor $a$. In a radiation-dominated universe, they
grow $\propto a^2$. Handy numerical algorithms for computing
$D(a)$ can be found in Hamilton (2001). A good
approximation for the general case with a constant dark-energy
density is

$$D(z) \propto \frac{5\,\Omega_M(z)}{2(1+z)} \left[ \Omega_M^{4/7}(z) - \Omega_\Lambda(z) + \left(1 + \frac{\Omega_M(z)}{2}\right) \left(1 + \frac{\Omega_\Lambda(z)}{70}\right) \right]^{-1} \qquad (34)$$

(see Carroll et al., 1992; Lahav et al., 1991).
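To make the comparison between equations (33) and (34) concrete, here is a minimal Python sketch that evaluates the growth integral by direct quadrature and compares it with the fitting form usually attributed to Carroll et al. (1992). The function names, the midpoint-rule quadrature, and the choice of units in which $H_0 = 1$ are assumptions of this example; radiation is neglected and the dark-energy density is taken to be constant.

```python
import math


def a_dot(a, omega_m, omega_l):
    """da/dt in units of H0 for matter, curvature, and constant dark energy."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m / a + omega_l * a * a + omega_k)


def growth_integral(a, omega_m, omega_l, steps=20000):
    """Unnormalized eq. (33): (a_dot / a) * integral_0^a da' / a_dot'^3."""
    h = a / steps
    total = 0.0
    for i in range(steps):
        mid = (i + 0.5) * h  # midpoint rule avoids evaluating at a' = 0
        total += a_dot(mid, omega_m, omega_l) ** -3
    return a_dot(a, omega_m, omega_l) / a * total * h


def growth_numeric(z, omega_m, omega_l):
    """D(z) from eq. (33), normalized to unity at z = 0."""
    a = 1.0 / (1.0 + z)
    return growth_integral(a, omega_m, omega_l) / growth_integral(1.0, omega_m, omega_l)


def growth_fit(z, omega_m, omega_l):
    """D(z) from the fitting form of eq. (34), normalized to unity at z = 0."""
    def unnormalized(zz):
        e2 = (omega_m * (1.0 + zz) ** 3
              + (1.0 - omega_m - omega_l) * (1.0 + zz) ** 2 + omega_l)
        om = omega_m * (1.0 + zz) ** 3 / e2   # Omega_M(z)
        ol = omega_l / e2                     # Omega_Lambda(z)
        bracket = om ** (4.0 / 7.0) - ol + (1.0 + om / 2.0) * (1.0 + ol / 70.0)
        return 2.5 * om / bracket / (1.0 + zz)
    return unnormalized(z) / unnormalized(0.0)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, growth_numeric(z, 0.3, 0.7), growth_fit(z, 0.3, 0.7))
```

In the matter-dominated case ($\Omega_M = 1$, $\Omega_\Lambda = 0$) the quadrature should return $D = a$, matching the statement that perturbations grow in proportion to the scale factor.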

If the dark-energy density is homogeneous but not
constant in time, then the dark-energy density in the
perturbed sphere of radius $R_2$ does not depend on its radius.
In that case, one must solve a differential equation to
determine the evolution of $\delta = \delta\rho/\rho$ in the linear regime.
Differentiating $R_2 = R_1(1 - \delta/3)$ twice with respect to
time and keeping only the lowest order terms leads to

$$\ddot{\delta} + 2\,\frac{\dot{a}}{a}\,\dot{\delta} = 4\pi G \rho_M\, \delta \qquad (35)$$

in a universe with negligible radiation density.
Wang and Steinhardt (1998) derive a useful
approximation to the growth function by defining $\alpha_w$ such
that

$$\frac{d\ln\delta}{d\ln a} = [\Omega_M(z)]^{\alpha_w} . \qquad (36)$$



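For a slowly varying equation of state ($|dw/d\Omega_M(z)| \ll [1 - \Omega_M(z)]^{-1}$),
they find that

$$\alpha_w \approx \frac{3}{5 - \frac{w}{1-w}} + \frac{3}{125}\, \frac{(1-w)(1-3w/2)}{(1-6w/5)^3}\, [1 - \Omega_M(z)] \qquad (37)$$

to lowest order in $1 - \Omega_M(z)$. Using this expression for
$\alpha_w$ in the integral

$$D(a) \approx a \exp\left( \int_a^1 \left\{ 1 - [\Omega_M(z)]^{\alpha_w} \right\} \frac{da}{a} \right) \qquad (38)$$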



reproduces the growth function obtained from numerical
integration of equation (33) to better than 1% for
$\Omega_M(z) > 0.2$.
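A short Python sketch of the Wang and Steinhardt approximation, equations (36) through (38), is given below. It assumes a flat universe with constant $w$, takes the integral in equation (38) to run from $a$ to 1 so that $D = 1$ at the present day, and uses function names and a step count that are choices of this example rather than anything prescribed by the original work.

```python
import math


def omega_m_of_a(a, omega_m0, w):
    """Omega_M at scale factor a in a flat universe with constant w."""
    matter = omega_m0 * a ** -3
    dark = (1.0 - omega_m0) * a ** (-3.0 * (1.0 + w))
    return matter / (matter + dark)


def alpha_w(om, w):
    """Growth index of eq. (37), to lowest order in 1 - Omega_M(z)."""
    lead = 3.0 / (5.0 - w / (1.0 - w))
    corr = (3.0 / 125.0) * (1.0 - w) * (1.0 - 1.5 * w) / (1.0 - 1.2 * w) ** 3
    return lead + corr * (1.0 - om)


def growth_ws(a, omega_m0, w=-1.0, steps=2000):
    """D(a) from eq. (38), normalized so that D = 1 at a = 1."""
    if a >= 1.0:
        return 1.0
    du = -math.log(a) / steps          # step in ln a between a and 1
    total = 0.0
    for i in range(steps):
        a_mid = math.exp(math.log(a) + (i + 0.5) * du)
        om = omega_m_of_a(a_mid, omega_m0, w)
        total += (1.0 - om ** alpha_w(om, w)) * du
    return a * math.exp(total)


if __name__ == "__main__":
    for w in (-1.0, -0.8, -0.6):
        print(w, growth_ws(0.5, 0.3, w))
```

For $\Omega_M = 1$ the exponent vanishes and the sketch returns $D = a$, as expected for a matter-dominated universe.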

These growth functions are valid only as long as pres- 
sure gradients do not alter the dynamics of the perturba- 
tion. Pressure effects are not an issue when the scale of a 
perturbation is larger than the Hubble length $cH^{-1}$. In
that regime the growth functions found by solving
equations (33) and (35) remain valid. Yet, as the universe
ages, it encompasses perturbations of increasingly larger 
scale and additional physical effects enter the picture. 

The bad news is that a variety of processes alter the 
scale-free nature of the original perturbation spectrum. 
The good news is that the imprint of these processes on 
P{k) can tell us a great deal about the contents and dy- 
namics of the universe. During the radiation-dominated 
era of the universe ($z > z_{eq}$), pressure effects begin to
alter the growth of a given mode when its wavelength
is finally contained within the horizon length $\sim cH^{-1}$.
Then radiation pressure can effectively resist gravita- 
tional compression, inhibiting further growth of modes 
at that wavelength. Instead, these modes in the cou- 
pled photon-baryon fluid begin to oscillate as acoustic 
waves, and eventually damp owing to photon diffusion 
out of higher-density, higher-temperature regions.
Perturbation growth in the dark-matter component therefore
stalls near the amplitude at which the perturbations
were first contained within the horizon because the
gravitationally dominant photon component no longer spurs
mode growth. These perturbations then resume growing
at $z_{eq}$, when matter begins to dominate the dynamics.
The transition from radiation domination to matter
domination therefore imprints a bend in $P(k)$ on a length
scale corresponding to the horizon scale at $z_{eq}$.

Perturbation growth is scale-independent during the 
matter-dominated era only insofar as the matter can be 
considered cold on the scale of the perturbation. If the 
characteristic velocities of the matter particles are not 
small compared to the escape velocity from the pertur- 
bation, then both pressure forces and particles streaming 
out of denser regions can damp small-scale perturbations. 
Each effect of this type imprints its own characteristic 
feature on P{k). 

All of these scale-imprinting effects that alter $P(k)$
from the time the primordial power spectrum is created
until the present day are typically subsumed into a single
quantity known as the transfer function, defined to be

$$T(k) \equiv \frac{\delta_k(z = 0)}{\delta_k(z)\, D(z)} , \qquad (39)$$



where the symbol $k$ refers to comoving modes with
wavenumber $(1+z)k$ in physical space, a convention
implicit throughout this review. The redshift $z$ in this
definition is assumed to be large enough that $\delta_k(z)$
reflects the original power spectrum imprinted by inflation
or some other process. The transfer function therefore
represents all the alterations of the original power
spectrum that subsequently occur, except for those involving
mode growth in the non-linear regime. If the primordial
spectrum is a power law of index $n_p \approx 1$, then
the power spectrum of linear perturbations at $z = 0$ is
$P(k) \propto k^{n_p}\, T^2(k)$.



5. The CDM Power Spectrum 

The most successful models for the formation of large- 
scale structures like clusters of galaxies assume that cold 
dark matter (CDM) is responsible. Particles that inter- 
act only through gravity exert negligible pressure, and if 
their random velocities are small then they will not be 
able to escape from incipient potential wells on the scales 
of interest. That is, they will be too "cold" to damp the 
relevant perturbations by freely streaming out of them. 
Thus, the transfer function for a universe containing 
only radiation and cold dark matter has just one fea- 
ture, corresponding to the wavenumber of the mode that 
enters the horizon at the matter-radiation equality
redshift $z_{eq}$, with a comoving size
$l_{eq} \sim cH_0^{-1}(\Omega_M z_{eq})^{-1/2} \sim 20\,(\Omega_M h^2)^{-1}$ Mpc.

Growth of modes with smaller comoving wavelengths
temporarily stalls from the redshift at which they enter
the horizon until $z_{eq}$. Because radiation dominates the
universe during this time interval, the comoving size of
the horizon scales as $a$ while the growth function scales
as $a^2$. Short-wavelength perturbations therefore miss out
on a growth factor $(k l_{eq})^2$, corresponding to the square
of the change in scale factor from the time a perturbation
enters the horizon to the time of matter-radiation
equality. Growth of long-wavelength modes, on the other
hand, does not stall at all. The behavior of the CDM
transfer function in the two extremes is $T(k) \approx 1$ for
$k \ll l_{eq}^{-1}$ and $T(k) \approx (k l_{eq})^{-2}$ for $k \gg l_{eq}^{-1}$. For $n_p \approx 1$,
these scalings translate to $\delta M/M \sim M^{-2/3}$ on large
scales and $\delta M/M \sim$ const. on small scales, meaning that
structure formation in a CDM universe is hierarchical,
with small-scale perturbations reaching the non-linear
regime before larger-scale ones.

Numerical computations are needed to derive the exact
CDM transfer function, but many authors have provided
useful analytical fits to those numerical results. One such
expression is

$$T(k) = \frac{\ln(1 + 2.34q)}{2.34q} \left[ 1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4 \right]^{-1/4} , \qquad (40)$$

with $q = k\,(\Omega_M h^2)^{-1}$ Mpc (Bardeen et al., 1986).
Allowing for trace populations of baryons and massive
neutrinos alters the CDM power spectrum in minor but
interesting ways. For example, a small proportion of
baryons lowers the apparent dark-matter density parameter,
causing a shape-preserving shift in the CDM transfer
function (Peacock and Dodds, 1994). This shift can
be reproduced by setting $q = k\,(\Gamma h)^{-1}$ Mpc, so that
it includes a shape parameter
$\Gamma = \Omega_M h \exp[-\Omega_b(1 + \sqrt{2h}/\Omega_M)]$ (Sugiyama, 1995).
Fitting formulae accommodating additional modifications
owing to baryons and massive neutrinos can be found in
Eisenstein and Hu (1998, 1999).
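The fitting formula of equation (40) and the shape parameter are straightforward to evaluate. The Python sketch below is one possible implementation; the function names and the convention that $k$ is supplied in units of Mpc$^{-1}$ are assumptions of this example, and the parameter values in the usage lines are simply the concordance values quoted in this review.

```python
import math


def shape_parameter(omega_m, omega_b, h):
    """Gamma = Omega_M h exp[-Omega_b (1 + sqrt(2h) / Omega_M)]."""
    return omega_m * h * math.exp(-omega_b * (1.0 + math.sqrt(2.0 * h) / omega_m))


def transfer_cdm(k, gamma, h):
    """Fitting formula of eq. (40) with q = k / (Gamma h), k in 1/Mpc."""
    q = k / (gamma * h)
    if q < 1e-12:
        return 1.0  # long-wavelength limit, T -> 1
    log_term = math.log1p(2.34 * q) / (2.34 * q)
    poly = (1.0 + 3.89 * q + (16.1 * q) ** 2
            + (5.46 * q) ** 3 + (6.71 * q) ** 4)
    return log_term * poly ** -0.25


if __name__ == "__main__":
    h = 0.71
    gamma = shape_parameter(0.3, 0.04, h)
    print("Gamma =", gamma)
    for k in (1e-3, 1e-2, 1e-1, 1.0):
        print(k, transfer_cdm(k, gamma, h))
```

Setting `gamma = omega_m * h` recovers the pure-CDM case with $q = k\,(\Omega_M h^2)^{-1}$ Mpc.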

6. Power Spectrum Normalization 

The preceding sections give the theoretical expectations
for the shape and growth rate of the density
perturbation spectrum but do not specify its normalization.
Because inflationary theories do not make firm
predictions about the amplitude of the primordial power
spectrum, the normalization of $P(k)$ must be determined
observationally. For example, measurements of
the present-day mass distribution of the universe indicate
that $\delta M/M \approx 1$ within comoving spheres of radius
$8h^{-1}$ Mpc (Sec. III.C), as suggested by early galaxy
surveys showing that the variance in galaxy counts was
of order unity on this length scale (Davis and Peebles,
1983).

This feature of the universe is the motivation for
expressing the power-spectrum normalization in terms of
the quantity $\sigma_8$, where

$$\sigma_8^2 = \frac{1}{(2\pi)^3} \int P(k)\, |W_k|^2\, d^3k \qquad (41)$$

is the variance defined with respect to a top-hat window
function $W(r)$ having a constant value inside a comoving
radius of $8h^{-1}$ Mpc and vanishing outside this radius.
When using this formula, one must keep in mind that
$P(k)$ refers to the power spectrum of linear perturbations
evolved to $z = 0$ according to the growth function
$D(z)$, which is valid only for small perturbations. There
are other ways of characterizing the power-spectrum
normalization, but $\sigma_8$ is the most widely used parameter.
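As an illustration of how equation (41) fixes the normalization, the following Python sketch computes the top-hat variance for a spectrum of the form $P(k) \propto k^{n_p} T^2(k)$ and rescales it to a chosen $\sigma_8$. The function names, the integration limits, the use of $k$ in units of $h$ Mpc$^{-1}$ (so that $q = k/\Gamma$), and the example values $\sigma_8 = 0.9$ and $\Gamma = 0.2$ are all assumptions made for this example.

```python
import math


def tophat_window(x):
    """Fourier transform of a unit-integral spherical top hat, x = k R."""
    if x < 1e-3:
        return 1.0 - x * x / 10.0  # series expansion avoids cancellation
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def transfer_cdm(q):
    """Fitting formula of eq. (40)."""
    log_term = math.log1p(2.34 * q) / (2.34 * q)
    poly = (1.0 + 3.89 * q + (16.1 * q) ** 2
            + (5.46 * q) ** 3 + (6.71 * q) ** 4)
    return log_term * poly ** -0.25


def sigma_unnormalized(radius, gamma, n_p=1.0,
                       k_min=1e-5, k_max=1e2, steps=40000):
    """sqrt of eq. (41) for P(k) = k**n_p T(k)^2, arbitrary amplitude.

    radius is in Mpc/h and k in h/Mpc; trapezoid rule in ln k.
    """
    du = math.log(k_max / k_min) / steps
    total = 0.0
    for i in range(steps + 1):
        k = k_min * math.exp(i * du)
        weight = 0.5 if i in (0, steps) else 1.0
        power = k ** n_p * transfer_cdm(k / gamma) ** 2
        total += weight * k ** 3 * power * tophat_window(k * radius) ** 2
    # d^3k / (2 pi)^3 = k^3 d(ln k) / (2 pi^2)
    return math.sqrt(total * du / (2.0 * math.pi ** 2))


def normalized_sigma(radius, sigma8, gamma, n_p=1.0):
    """rms linear mass fluctuation in spheres of the given radius."""
    ratio = (sigma_unnormalized(radius, gamma, n_p)
             / sigma_unnormalized(8.0, gamma, n_p))
    return sigma8 * ratio


if __name__ == "__main__":
    for radius in (2.0, 4.0, 8.0, 16.0, 32.0):
        print(radius, normalized_sigma(radius, 0.9, 0.2))
```

The output decreases with radius, reflecting the hierarchical ordering of fluctuation amplitudes in a CDM spectrum.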

7. Summary of Cosmological Parameters 

At the beginning of this recipe, we promised to
encapsulate the overall cosmological model in two small sets of
parameters. The set governing the global behavior of the
universe consists of $H_0$, $\Omega_M$, $\Omega_b$, $\Omega_R$, $\Omega_\Lambda$, and $w$. The
set governing the initial density perturbation spectrum
consists of $\sigma_8$ and $n_p$. The shape parameter $\Gamma$ is not a
free parameter in standard cold dark matter models but
is sometimes treated as a free parameter in order to test
variants of the standard model.

In the concordance model, also known as the $\Lambda$CDM
model, to denote cold dark matter with a cosmological 
constant, these parameters are all assigned values close 
to the most likely values implied by observations: 

• Hubble's constant. The consensus value of this
parameter, measured primarily from the expansion
rate of the local universe, is
$H_0 = 71 \pm 7\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$ (Freedman et al., 2001).

• Matter density. Several different methods involving
clusters indicate that $\Omega_M \approx 0.3$ (Sec. III.C,
Sec. III.D.3). Combining the results of distant
supernova observations and observations of temperature
patterns in the microwave background gives
a similar value for this parameter. Figure 4 shows
one example of these mutual constraints in the
$\Omega_M$-$\Omega_\Lambda$ plane.

• Baryon density. The abundances of light elements
formed during primordial nucleosynthesis indicate
that $\Omega_b = 0.02\,h^{-2}$, equal to $\Omega_b = 0.04$
for the value of Hubble's constant given above
(e.g., Burles et al., 2001). This value is consistent
with the baryon density inferred from the fluctuations
in the cosmic microwave background (e.g.,
Spergel et al., 2003).

• Radiation density. The energy density $\Omega_R$ in
electromagnetic radiation is simply calculated from
the microwave background temperature
$T_{CMB} = 2.728 \pm 0.004$ K (Fixsen et al., 1996) and Hubble's
constant. Neutrinos may also contribute to the
energy density in relativistic matter, if their masses
are sufficiently small, but this contribution is
currently too small to affect the global dynamics.

• Dark energy density. Observations of distant
supernovae imply that the expansion of the universe
is accelerating at a rate consistent with a constant
dark-energy density corresponding to $\Omega_\Lambda \approx 0.7$
(Perlmutter et al., 1999; Riess et al., 1998, 2004).
Combining the matter density inferred from clusters
with the flat geometry inferred from temperature
patterns in the microwave background corroborates
this result (e.g., Bahcall et al., 1999; see also
Figure 4).

• Dark energy equation of state. Observations of
microwave background patterns, when combined with
observations of large-scale structure, are consistent
with Einstein's cosmological constant ($w = -1.0$)
but not with $w > -0.8$ (Spergel et al., 2003).
Alternatively, combining cluster surveys with
observations of distant supernovae leads to similar
constraints, $w = -0.95^{+0.30}_{-0.35}$ (Schuecker et al.,
2003). However, theoretical arguments suggest
that the parameter $w$ may be redshift dependent
(Peebles and Ratra, 2003).

• Normalization of density perturbations. The cluster
observations discussed in Sec. III.C and Sec. III.D
indicate that the power-spectrum normalization
falls into the range $\sigma_8 \approx 0.7$-$1.0$. This range
is consistent with both structures in the cosmic
microwave background and other observations of
large-scale structure.

• Slope of primordial perturbation spectrum. All
available information indicates that $n_p \approx 1$.
Constraints from observations of the microwave
background, when combined with optical observations
of large-scale structure, give $n_p = 0.97 \pm 0.03$
(Spergel et al., 2003).

• Shape parameter of perturbation spectrum. The
concordance values of $\Omega_M$, $\Omega_b$, and $H_0$ given above
imply $\Gamma \approx 0.2$, which agrees with the value of $\Gamma$
derived from observations of large-scale structure
(e.g., Peacock and Dodds, 1994; Schuecker et al.,
2001; Szalay et al., 2003), an important element of
self-consistency in the concordance model.

Several other closely related models have been pursued
during the last two decades, but none of them has
proven as successful in explaining such a large number
of observations. Here are a few variants, some of which
will be discussed later in connection with the parameter
constraints derived from cluster surveys:

• Standard cold dark matter (SCDM). In this model,
the universe is assumed to be flat, with no dark
energy, so $\Omega_M = 1$ and $\Omega_\Lambda = 0$. For this value of the
matter density, measurements of large-scale structure,
including clusters, imply that $\sigma_8 = 0.4$-$0.5$
(Sec. III.C). However, the shape parameter implied
for this model, given the observed Hubble
constant, is $\Gamma \approx 0.7$, which conflicts with
observations of large-scale structure (e.g., Szalay et al.,
FIG. 4 Cosmological constraints from cluster evolution
(Vikhlinin et al., 2003), supernovae (Perlmutter et al., 1999;
Riess et al., 1998), and WMAP observations of the cosmic
microwave background (Spergel et al., 2003). The horizontal
axis labeled $\Omega$ gives the value of $\Omega_M$; the vertical axis labeled
$\Lambda$ gives the value of $\Omega_\Lambda$. These particular constraints from
cluster evolution are based on the baryonic mass function of
clusters (Sec. III.D.3), but other measures of cluster evolution
give similar results. The complementarity of these constraints
is evident from the figure, and the common region of overlap
near $\Omega_M \approx 0.3$ and $\Omega_\Lambda \approx 0.7$ is reassuring evidence of
consistency in the overall picture. (Figure courtesy of Alexey
Vikhlinin.)

2003). Also, this value of $\Omega_M$ leads to a
baryon-to-dark-matter ratio $f_b = \Omega_b/\Omega_M$ that is inconsistent
with cluster observations (Sec. III.C.7).

• Tilted cold dark matter. One way to make models
with $\Omega_M = 1$ more consistent with large-scale structure
observations is to assume that the primordial
perturbation spectrum is "tilted" so that it is
significantly shallower than $n_p = 1$ (e.g., Cen et al.,
1992). However, such models conflict with the
strong constraints on $n_p$ inferred from microwave
background observations.

• Ad hoc power spectrum ($\tau$CDM). Another option
for making $\Omega_M = 1$ models more consistent with
observations is to arbitrarily adjust the shape of the
perturbation spectrum to fit the observations. One
such realization is the $\tau$CDM model (Jenkins et al.,
1998), which sets $\Omega_M = 1$, $\sigma_8 = 0.5$ and $\Gamma = 0.2$,
even though there is little physical justification for
having such a low value of $\Gamma$ in a cosmology with
such a large matter density (but see White et al.).

• Open cold dark matter (OCDM). There is also the
option of accepting the evidence that $\Omega_M \approx 0.3$ but
dispensing with dark energy ($\Omega_\Lambda = 0$) so that the
universe has an open geometry. In this case, the
perturbation spectrum can be identical to the one
in the $\Lambda$CDM concordance model, but the growth
rate of those perturbations differs because of the
changed expansion rate at late times.

This review does not consider models involving forms
of dark matter other than cold dark matter, but it
does consider models with generalized forms of dark
energy having $w \neq -1$ ($w$CDM). Current cluster surveys
are not yet large enough to place strong constraints on
this equation-of-state parameter, but they might provide
much stronger constraints in the coming decade
(Sec. III.D.3).

B. Cluster Formation 

Cluster formation from perturbations in the density 
distribution of cold dark matter is a hierarchical process. 
Small subclumps of matter are the first pieces of the clus- 
ter to deviate from the Hubble flow and undergo gravita- 
tional relaxation because the density perturbations have 
larger amplitudes on smaller mass scales. These small 
pieces then merge and coalesce to form progressively 
larger structures as perturbations on larger mass scales 
reach the non-linear regime. A full understanding of the 
details of this hierarchical merging process requires nu- 
merical simulations, but simplified, spherically symmet- 
ric models of cluster formation illustrate many of the im- 
portant concepts. This part of the review shows how a 
cluster would grow from a spherically symmetric mass 
perturbation and then refines the details of that sim- 
plified approach, based on what we have learned from 
numerical simulations. 



1. Spherical Collapse

The most basic features of cluster formation can be
understood in terms of a spherically symmetric collapse
model (e.g., Bertschinger, 1985; Fillmore and Goldreich,
1984; Gunn and Gott, 1972). In such a model, the matter
that goes on to form a cluster begins as a low-amplitude
density perturbation that initially expands along with
the rest of the universe. The perturbation's gravitational
pull gradually slows the expansion of that matter,
eventually halting and reversing the expansion. A cluster
of matter then forms at the center of the perturbation,
and the rate at which additional matter accretes onto the
cluster depends on the distribution of density with radius
in the initial perturbation.

In a geometry that is perfectly spherically symmetric,
the behavior of an individual mass shell in the presence
of a homogeneous generalized dark-energy field follows
the equation of motion

$$\ddot{r}_{sh} = -\frac{G M_{sh}}{r_{sh}^2} - \frac{1+3w}{2}\, \Omega_\Lambda H_0^2\, (1+z)^{3(1+w)}\, r_{sh} , \qquad (42)$$

where $r_{sh}$ is the shell radius and $M_{sh}$ is the mass enclosed
within $r_{sh}$. Throughout the early evolution of a spherical
perturbation, the value of $M_{sh}$ within a given mass shell
remains constant. Thus, if the dark-energy term is negligible,
the radius of a mass shell obeys the parametric solution
$r_{sh} = r_{ta}[(1 - \cos\theta_M)/2]$, $t = t_c[(\theta_M - \sin\theta_M)/2\pi]$,
with a turnaround radius $r_{ta} = [(2GM_{sh}t_c^2)/\pi^2]^{1/3}$ for a
shell that collapses to the origin at time $t_c$. The solution
for $\Omega_\Lambda \approx 0.7$ and $w \approx -1$ is not much different because
the dark-energy term remains $\lesssim 15\%$ of the matter term
during the trajectory of all shells that collapse to the
origin by the present time. If greater accuracy is needed,
a shell's trajectory can be computed numerically from
equation (42).

Once a shell collapses, the mass within it no longer
remains constant. Because the dark matter within a
collapsing shell is collisionless, shells on different trajectories
can easily interpenetrate. The radii of collapsed shells
in this idealized geometry therefore oscillate symmetrically
about the origin, and the amplitudes of these oscillations
modestly decrease with time as mass associated
with other collapsed shells accumulates within the
oscillations' turning points (Gunn, 1977).

The accretion process in real clusters is not so symmetric.
Instead, gravitational forces between infalling clumps
of matter produce a time-varying gravitational potential
that randomizes the velocities of the infalling particles,
yielding a Maxwellian velocity distribution in which
temperature is proportional to the particle mass. This
process, known as "violent relaxation" (Lynden-Bell, 1967),
leads to a state of virial equilibrium in which the total
kinetic energy $E_K$ is related to the total gravitational
potential energy $E_G$ through the equation

$$E_G + 2E_K = 4\pi P_b r_b^3 , \qquad (43)$$

where $P_b$ is the effective pressure owing to infalling
matter at the boundary $r_b$ of the collapsed system
(Sec. III.A.2). Setting $P_b$ to zero yields the usual form
of the virial theorem for gravitationally bound systems.
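The parametric solution for a mass shell in the limit of negligible dark energy can be sketched in a few lines of Python. The function names and the unit system (masses in solar masses, lengths in Mpc, velocities in km/s, and therefore times in Mpc s/km, where 1 Mpc s/km is about 978 Gyr) are assumptions of this example; the shell velocity is obtained by differentiating the parametric expressions with respect to the development angle.

```python
import math

G_NEWTON = 4.30091e-9  # gravitational constant in Mpc (km/s)^2 / Msun


def turnaround_radius(mass, t_collapse, g=G_NEWTON):
    """r_ta = (2 G M t_c^2 / pi^2)^(1/3) for a shell collapsing at t_c."""
    return (2.0 * g * mass * t_collapse ** 2 / math.pi ** 2) ** (1.0 / 3.0)


def shell_state(theta, mass, t_collapse, g=G_NEWTON):
    """Return (t, r, dr/dt) at development angle theta in (0, 2 pi)."""
    r_ta = turnaround_radius(mass, t_collapse, g)
    r = r_ta * (1.0 - math.cos(theta)) / 2.0
    t = t_collapse * (theta - math.sin(theta)) / (2.0 * math.pi)
    # dr/dt = (dr/dtheta) / (dt/dtheta)
    v = math.pi * r_ta * math.sin(theta) / (t_collapse * (1.0 - math.cos(theta)))
    return t, r, v


if __name__ == "__main__":
    mass = 1.0e15            # Msun, illustrative cluster-scale value
    t_c = 13.7 / 977.8       # roughly 13.7 Gyr expressed in Mpc s/km
    print("r_ta [Mpc] =", turnaround_radius(mass, t_c))
    for theta in (0.5, math.pi, 5.0):
        print(theta, shell_state(theta, mass, t_c))
```

A useful consistency check is that the specific energy $\dot{r}^2/2 - GM_{sh}/r_{sh}$ stays equal to $-GM_{sh}/r_{ta}$ along the whole trajectory, with the shell momentarily at rest at $\theta_M = \pi$.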

A common toy model for estimating the location of a 
cluster's outer boundary is the spherical top-hat model, 
which assumes that the perturbation leading to a cluster 
is a spherical region of constant density. All of the mass 
shells in such a perturbation move in unison and collapse 
to the origin simultaneously. The virial theorem therefore
suggests that the bounding radius of the cluster after
it collapses and relaxes should be in the neighborhood of
half the turnaround radius. Numerical simulations indeed
show that particle velocities within this radius are
generally isotropic and those outside this radius are
generally infalling, but the boundary between the isotropic
and infalling regions is not particularly distinct (Evrard,
1990; Navarro and White, 1993).

* Here we are making the standard assumption that the
collapsing dark matter has no effect on the local dark-energy density
(e.g., Wang and Steinhardt, 1998; Weinberg and Kamionkowski,
2003). If in fact the dark-matter collapse alters the local
properties of dark energy, the dynamics could be somewhat altered
(Mota and van de Bruck, 2004).

The spherical top-hat model has actually led to several
different definitions for the virial radius of a cluster.
If one assumes that all the mass in the original
top-hat perturbation ends up within $r_{ta}/2$, then the
mass density in that region is $6M/\pi r_{ta}^3$. In a
matter-dominated universe with zero dark energy, this density
is equal to $\Delta_v = 8\pi^2/(Ht)^2$ times the critical density
$\rho_{cr} = 3H^2/8\pi G$. Thus, for a flat, matter-dominated
universe in which $Ht = 2/3$, the mean density of a
perturbation that has just collapsed is taken to be $18\pi^2 \approx 178$
times the critical density. A useful approximation for $\Delta_v$
in a flat universe with a non-zero cosmological constant
($w = -1$) is

$$\Delta_v = 18\pi^2 + 82\,[\Omega_M(z) - 1] - 39\,[\Omega_M(z) - 1]^2 \qquad (44)$$

(Bryan and Norman, 1998). Because the outer radius of
a real cluster is not distinct, one pragmatic definition
of the virial radius is then the radius within which
the mean matter density is $\Delta_v\rho_{cr}$ (Eke et al., 1996).
However, the numerical value of $\Delta_v$ in a flat,
matter-dominated universe has inspired other definitions. A
common alternative is the scale radius $r_{200}$, within which
the mean matter density is $200\rho_{cr}$. Another frequently
used scale radius is $r_{180m}$, within which the mean matter
density is 180 times the mean background density
$\Omega_M(z)\rho_{cr}$. As long as $\Omega_M(z) \approx 1$, both of these scale
radii are nearly identical to $r_v$, but because $\Omega_M \approx 0.3$ at
the present time, these radii are now somewhat different,
with $r_{200} < r_v < r_{180m}$. This multiplicity of definitions
for the radius of a cluster is a potential source of
confusion, but as we will see below, each of these scale radii
can be particularly well suited to certain applications.
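The ordering of the three scale radii follows directly from the density thresholds that define them. The Python sketch below evaluates equation (44) alongside the thresholds for $r_{200}$ and $r_{180m}$, all expressed in units of the critical density; the function names and the dictionary keys are labels invented for this example, and a flat universe with a cosmological constant is assumed.

```python
import math


def omega_m_z(z, omega_m0):
    """Omega_M(z) for a flat universe with a cosmological constant."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def delta_v(omega_m):
    """Virial overdensity relative to critical density, eq. (44)."""
    x = omega_m - 1.0
    return 18.0 * math.pi ** 2 + 82.0 * x - 39.0 * x * x


def density_thresholds(z, omega_m0):
    """Mean enclosed density, in units of rho_cr(z), defining each radius."""
    om = omega_m_z(z, omega_m0)
    return {"r_200": 200.0, "r_v": delta_v(om), "r_180m": 180.0 * om}


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 3.0):
        print(z, density_thresholds(z, 0.3))
```

Because mean enclosed density falls with radius, a higher threshold corresponds to a smaller radius, so thresholds ordered as 200 > $\Delta_v$ > $180\,\Omega_M$ at the present time imply $r_{200} < r_v < r_{180m}$. At high redshift, where $\Omega_M(z)$ approaches unity, the three thresholds converge.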

2. Cluster Mass Profiles 

Observations of galaxy clusters have long indicated
that the velocity dispersion of a cluster's galaxies
remains relatively constant with distance from the
cluster's center, implying an underlying mass-density profile
$\rho_M(r) \propto r^{-2}$. The simplest analytical cluster model
consistent with such a density profile is the singular
isothermal sphere, in which the velocity dispersion $\sigma_v$ is
constant and isotropic at every point and $\rho_M(r) = \sigma_v^2/2\pi G r^2$
(e.g., Binney and Tremaine, 1987). This model is useful
for making analytical estimates of cluster properties, but
it is incomplete because the total mass diverges linearly
with radius.

Numerical simulations of cluster formation produce
dark-matter halos whose density profiles are shallower
than isothermal at small radii and steeper than
isothermal at large radii. A generic form for representing these
profiles is

$$\rho_M(r) \propto r^{-p}\,(r + r_s)^{p-q} , \qquad (45)$$

where the parameters $p$ and $q$ describe the inner and
outer power-law slopes and the radius $r_s$ specifies where
the profile steepens. Groups that have fit such profiles to
simulated clusters disagree about the best values of $p$ and
$q$ but typically find $1 \lesssim p \lesssim 1.5$ and $2.5 \lesssim q \lesssim 3$. Specific
examples include the NFW profile, with $p = 1$ and $q = 3$
(Navarro et al., 1997), the Moore profile, with $p = 1.5$
and $q = 3$, and the Rasia et al. (2003) profile, with $p = 1$
and $q = 2.5$. Both optical and X-ray observations
indicate that density profiles of this sort are good
representations of the underlying mass profiles of clusters, at
least outside of the innermost regions (Carlberg et al.,
1997b; Lewis et al., 2003; Pratt and Arnaud, 2002).
Observing the asymptotic inner slope $p$ is currently a matter
of great observational interest, as the cuspiness of
dark-matter density profiles at $r = 0$ is one of the acid
tests of the CDM paradigm for structure formation (see
Navarro et al., 2003 and references therein). However,
we will not discuss that issue here because the global
properties of clusters depend little on the value of $p$. In
this review, we will use the NFW profile when necessary
because it remains the most widely used fitting formula
for representing the results of cosmological simulations.
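The behavior of the generic profile in equation (45) is easy to explore numerically. The Python sketch below evaluates the profile shape, its logarithmic slope, and the shape of the enclosed mass for the NFW case; the function names and the arbitrary normalization are assumptions of this example.

```python
import math


def density_shape(r, r_s, p, q):
    """Generic profile of eq. (45), up to an arbitrary normalization."""
    return r ** -p * (r + r_s) ** (p - q)


def log_slope(r, r_s, p, q):
    """Analytic d ln(rho) / d ln(r) for the profile of eq. (45)."""
    return -p + (p - q) * r / (r + r_s)


def nfw_enclosed_mass_shape(r, r_s):
    """M(<r) for p = 1, q = 3, in units of 4 pi rho_s r_s^3."""
    x = r / r_s
    return math.log1p(x) - x / (1.0 + x)


if __name__ == "__main__":
    for name, p, q in (("NFW", 1.0, 3.0), ("Moore", 1.5, 3.0), ("Rasia", 1.0, 2.5)):
        slopes = [log_slope(r, 1.0, p, q) for r in (0.01, 1.0, 100.0)]
        print(name, slopes)
    print([nfw_enclosed_mass_shape(r, 1.0) for r in (1.0, 10.0, 100.0)])
```

The slope runs from $-p$ at small radii to $-q$ at large radii and equals the isothermal value of $-2$ at $r = r_s$ for the NFW profile, while the enclosed NFW mass keeps growing logarithmically with radius.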

The transition of the density profile from shallow to
steep can also be expressed in terms of a concentration
parameter $c = r_b/r_s$, which expresses the bounding
radius of the cluster in units of $r_s$. Because the
concentration parameter depends on $r_b$, numerical values of $c$
depend somewhat on whether the bounding radius is taken
to be $r_v$, $r_{200}$, $r_{180m}$, or something similar. However,
these radii are not vastly different because they are
generally several times larger than $r_s$, meaning that the
enclosed mass is not rapidly diverging in the neighborhood
of the virial radius. Typical concentration parameters
for simulated clusters are in the range $c \sim 4$-$10$, with a
scatter in $\ln c$ of 0.2-0.35 (Jing, 2000). Also, lower-mass
objects tend to have higher halo concentrations because
they formed earlier in time, when the overall density of
the universe was greater (Bullock et al., 2001; Eke et al.,
2001; Navarro et al., 1997).



3. Defining Cluster Mass 

Even with these more sophisticated forms for the den- 
sity profile, mass still diverges with radius. Thus, a clus- 
ter's mass and all the relations linking that mass to other 
observable quantities depend on how one chooses to de- 
fine a cluster's outer boundary. One would like to define 
that boundary so as to maximize the simplicity of the re- 
lationships between cluster mass and other observables, 
but no single definition is best for all applications. 

The easiest way to link observations to theoretical
models is through definitions taking the mass of a cluster
to be $M_\Delta$, the amount of matter contained in a spherical
region of radius $r_\Delta$ whose mean density is $\Delta\rho_{cr}$. It is also
common for cluster mass to be defined with respect to
the background mass density, so that the mean density of
matter within the virial radius is $\Delta\,\Omega_M(z)\rho_{cr}$, but applying
this definition to observations requires prior knowledge
of $\Omega_M$. Spherical top-hat collapse suggests that $\Delta_v$
is a good choice for the density threshold. However,
observers often prefer to raise that threshold to $\Delta = 200$
or even $\Delta = 500$ for two reasons. The properties of a
cluster are easier to observe in regions where the density
contrast is higher, and simulations show that the region
within $r_{500}$ is considerably more relaxed than the region
within $r_v$.

As an example of such definitions in action, consider
the relation between velocity dispersion and the virial
mass $M_v$ obtained by truncating a singular isothermal
sphere at the virial radius $r_v$:
where the factor $f_\sigma$ is a parameter that can be adjusted
to account for the fact that clusters are not perfect
isothermal spheres (Bryan and Norman, 1998; Eke et al.;
Evrard, 1989). The presence of this parameter is a
reminder that the derivation of this relation should not
be taken too literally. Truncation of the mass
distribution at $r_\Delta$ formally implies a non-zero boundary
pressure that shifts the virial relation for this
configuration so that $E_K = -3E_G/4$, which is inconsistent
with the definition of $r_v$ (Voit, 2000). This functional
form for the $M_v$-$\sigma_v$ relation is useful primarily as a
fitting formula that accounts for most of the
cosmology-dependent changes in the normalization of the
relation. However, because the density profiles of dark
matter halos defined in this way depend on both mass and
redshift, the correction factor $f_\sigma$ is not a universal
constant.
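As an illustration of the underlying scaling, the sketch below evaluates the velocity dispersion of an idealized singular isothermal sphere, for which $M(r) = 2\sigma^2 r/G$. Combining this with $M_\Delta = (4\pi/3) r_\Delta^3 \Delta \rho_{cr}$ and $\rho_{cr} = 3H^2/8\pi G$ gives $\sigma^3 = G M_\Delta H(z) (\Delta/16)^{1/2}$. The code assumes a perfect isothermal sphere, so it contains no correction factor; the function name and the example mass are hypothetical choices.

```python
import math

G_NEWTON = 4.30091e-9  # gravitational constant in Mpc (km/s)^2 / M_sun

def sis_velocity_dispersion(mass, hubble, delta):
    """1D velocity dispersion (km/s) of a singular isothermal sphere truncated
    where its mean enclosed density equals delta times the critical density.

    mass   : enclosed mass in M_sun
    hubble : H(z) in km/s/Mpc
    delta  : density threshold in units of the critical density
    """
    return (G_NEWTON * mass * hubble * math.sqrt(delta / 16.0)) ** (1.0 / 3.0)

# Example: a 1e15 M_sun cluster at z = 0 with H_0 = 100 km/s/Mpc and Delta = 200.
sigma = sis_velocity_dispersion(1.0e15, 100.0, 200.0)
print(f"sigma = {sigma:.0f} km/s")
```

Because $\sigma \propto [M H(z)]^{1/3}$, quoting masses in $h^{-1} M_\odot$ and $H$ in units of $h$ makes the result independent of $h$.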

Recent work has shown that defining a cluster's mass
using the threshold $\Delta = 200$ leads to an $M_{200}$-$\sigma_v$
relation that is remarkably independent of cosmology.
Evrard (2004) finds that the relation
is an excellent fit to a broad set of simulated clusters
sampled over a wide range of redshifts. This relation is
equivalent to setting $f_\sigma = 1.2$ and $\Delta_v = 200$ in the
preceding relation for a truncated singular isothermal sphere.

Conversions between $M_v$ and $M_\Delta$ defined with respect
to an arbitrary $\Delta$ are straightforward as long as a
cluster's concentration parameter is known. From the
definitions of these masses, we have

$$ \frac{M_\Delta}{M_v} = \frac{\Delta}{\Delta_v} \left( \frac{r_\Delta}{r_v} \right)^3 \tag{48} $$

and the halo concentration gives the relationship between
$r_\Delta$ and $r_v$. Hu and Kravtsov (2003) have provided a
useful approximation for this relation in the case of an
NFW profile. Recasting their formulae in slightly different
notation, one can write the halo concentration $c_\Delta$
defined with respect to $r_\Delta$ in terms of the concentration
$c_v$ defined with respect to $r_v$:

$$ \frac{1}{c_\Delta} = \left[ a_1 f_c^{2 p_c} + \left( \frac{3}{4} \right)^2 \right]^{-1/2} + 2 f_c \tag{49} $$

with $f_c = (\Delta / c_v^3 \Delta_v) [\ln(1 + c_v) - c_v/(1 + c_v)]$,
$p_c = a_2 + a_3 \ln f_c + a_4 (\ln f_c)^2$, and
$(a_1, ..., a_4) = (0.5116, -0.4283, -3.13 \times 10^{-3}, -3.52 \times 10^{-5})$.
Plugging the ratio $r_\Delta/r_v = c_\Delta/c_v$ given by this
approximation into equation (48) converts cluster masses
with an accuracy of about 0.3% for the halo concentrations
typical of clusters.
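The conversion can be sketched in a few lines. The code below assumes the inversion formula in the form published by Hu and Kravtsov (2003), $1/c_\Delta = [a_1 f_c^{2p_c} + (3/4)^2]^{-1/2} + 2 f_c$, together with the coefficients quoted above; the function names are illustrative.

```python
import math

# Coefficients (a_1, ..., a_4) quoted in the text
A1, A2, A3, A4 = 0.5116, -0.4283, -3.13e-3, -3.52e-5

def nfw_mass_shape(c):
    """ln(1 + c) - c / (1 + c), the NFW enclosed-mass shape at x = c."""
    return math.log(1.0 + c) - c / (1.0 + c)

def convert_concentration(c_v, delta_ratio):
    """Approximate c_Delta from c_v, where delta_ratio = Delta / Delta_v."""
    f_c = delta_ratio * nfw_mass_shape(c_v) / c_v**3
    p_c = A2 + A3 * math.log(f_c) + A4 * math.log(f_c) ** 2
    inverse_c = 1.0 / math.sqrt(A1 * f_c ** (2.0 * p_c) + 0.75**2) + 2.0 * f_c
    return 1.0 / inverse_c

def mass_ratio(c_v, delta_ratio):
    """M_Delta / M_v = (Delta / Delta_v) (r_Delta / r_v)^3 with r_Delta / r_v = c_Delta / c_v."""
    c_delta = convert_concentration(c_v, delta_ratio)
    return delta_ratio * (c_delta / c_v) ** 3

# Example: treat Delta = 200 as the reference threshold and convert to Delta = 500.
c_200 = 5.0
print("c_500       =", round(convert_concentration(c_200, 500.0 / 200.0), 3))
print("M_200/M_500 =", round(1.0 / mass_ratio(c_200, 500.0 / 200.0), 3))
```

For a concentration of 5, the resulting $M_{200}/M_{500}$ comes out close to the factor of 1.4 that this review adopts when converting $M_{500}$-based calibrations.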

Some other definitions of cluster mass are useful in cer- 
tain contexts but are more difficult to relate to the top- 
hat collapse model. For example, observers who mea- 
sure cluster mass using gravitational lensing or the to- 
tal optical luminosity are essentially measuring cluster 
mass within a cylinder along the line of sight rather than 
a sphere. In principle, these observables can be linked 
with cluster masses defined with respect to a cylindrical 
boundary, but the relationships between those cylindrical 
masses and models of structure formation are not as well 
understood as their spherical counterparts. On the
theoretical side, the masses of clusters identified in
numerical simulations are sometimes defined using a
"friends-of-friends" algorithm that links neighboring mass
particles (e.g., Davis et al., 1985). However, clusters
defined in this way often have irregular boundaries
(White, 2001), making this sort of definition difficult to
apply to observations. Masses defined within spheres also have their
shortcomings, particularly in cases where two clusters 
are just beginning to merge, but in general provide the 
most direct link between cosmological models and obser- 
vations. 



4. Cluster Mass Function 

Some of the most powerful constraints on current cos- 
mological models come from observations of how clusters 
evolve with time. Because cosmological time scales are
so long, we cannot observe how individual clusters evolve
but rather observe how the demographics of the entire
cluster population change with redshift. An important
conceptual tool in this effort is the cluster mass function,
$n_M(M)$, which gives the number density of clusters with
mass greater than $M$ in a comoving volume element.
Notice that the cluster mass function inevitably depends on
how one defines cluster mass.

Combining spherical top-hat collapse with the growth
function for linear perturbations has led to a widely
used semi-analytical method for expressing the cluster
mass function in terms of cosmological parameters.
Press and Schechter (1974) pioneered the basic approach,
which was refined and extended by Bond et al. (1991),
Bower (1991), and Lacey and Cole (1993). This class of
models simplifies the problem of structure formation by
assuming that all density perturbations continue to grow
according to the linear growth rate $D(z)$ even when their
amplitudes become non-linear. When perturbations are
treated in this way, their variance on mass scale $M$ as a
function of redshift is

$$ \sigma^2(M, z) = \frac{D^2(z)}{(2\pi)^3} \int P(k) \, |W_k(M)|^2 \, d^3k \tag{50} $$

where $W_k(M) = 3(\sin k r_M - k r_M \cos k r_M)/(k r_M)^3$ with
$r_M = (3M/4\pi \Omega_M \rho_{cr0})^{1/3}$ is the Fourier-space
representation of a top-hat window function that encloses
mass $M$. The normalization of $P(k)$ is set so that
$\sigma(M_8, 0) = \sigma_8$ for
$M_8 = (8 h^{-1} {\rm Mpc})^3 \Omega_M H_0^2 / 2G = 6.0 \times 10^{14} \Omega_M h^{-1} M_\odot$.
These perturbations are then assumed to collapse and
virialize when their density contrast $\delta = \delta\rho/\rho$
exceeds a critical threshold $\delta_c$.
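The window function and the normalization mass scale are easy to evaluate directly. The sketch below assumes a present-day critical density of $2.775 \times 10^{11} h^2 M_\odot {\rm Mpc}^{-3}$; the function names are illustrative.

```python
import math

RHO_CR0 = 2.775e11  # present-day critical density in h^2 M_sun Mpc^-3 (assumed value)

def top_hat_radius(mass, omega_m):
    """Comoving radius r_M (h^-1 Mpc) of the sphere enclosing `mass` (h^-1 M_sun)."""
    return (3.0 * mass / (4.0 * math.pi * omega_m * RHO_CR0)) ** (1.0 / 3.0)

def top_hat_window(k, mass, omega_m):
    """Fourier-space top-hat window W_k(M) for wavenumber k in h Mpc^-1."""
    x = k * top_hat_radius(mass, omega_m)
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def mass_within_8(omega_m):
    """M_8: mean mass inside a sphere of radius 8 h^-1 Mpc, in h^-1 M_sun."""
    return 4.0 * math.pi / 3.0 * 8.0**3 * omega_m * RHO_CR0

print(f"M_8 (Omega_M = 1)   = {mass_within_8(1.0):.2e} h^-1 M_sun")
print(f"M_8 (Omega_M = 0.3) = {mass_within_8(0.3):.2e} h^-1 M_sun")
print(f"W_k at k = 0.1      = {top_hat_window(0.1, mass_within_8(0.3), 0.3):.3f}")
```

The window function approaches unity for $k r_M \ll 1$ and oscillates with decreasing amplitude for $k r_M \gg 1$, so $\sigma(M)$ is dominated by modes with wavelengths larger than $r_M$.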

Suppose the initial density perturbations are gaussian
with a variance $\sigma^2(M, z)$ that declines monotonically
with mass. Then according to the Press-Schechter approach,
the probability that a region of mass $M$ exceeds the
collapse threshold at redshift $z$ is
${\rm erfc}[\delta_c / \sqrt{2} \sigma(M, z)]$, where ${\rm erfc}(x)$ is
the complementary error function. Implicit in this
expression is the notion that all the mass in the universe
belongs to collapsed, virialized objects when viewed on
sufficiently small mass scales. It then follows that the
cluster mass function on scale $M$ at redshift $z$ is

$$ n_M(M) = \frac{\Omega_M \rho_{cr0}}{M} \, {\rm erfc} \left[ \frac{\delta_c}{\sqrt{2} \sigma(M, z)} \right] \tag{51} $$

This expression implies that the shape of the mass
function depends only on $\sigma(M, z)$ and remains invariant
with respect to the characteristic collapsing mass scale
$M_*(z)$ at which $\sigma(M_*, z) = \delta_c$. Observers often work
with the mass function in a differential form, such as
$dn_M/d \ln M$, but theorists prefer expressing the
differential form in terms of the shape-governing function
$\sigma(M, z)$. Then the differential mass function takes the
form

$$ \frac{dn_M}{d \ln \sigma^{-1}} = \sqrt{\frac{2}{\pi}} \, \frac{\Omega_M \rho_{cr0}}{M} \, \frac{\delta_c}{\sigma} \exp \left[ - \frac{\delta_c^2}{2 \sigma^2} \right] \tag{52} $$

Both of these forms for the mass function can be
straightforwardly extended to cases in which the
perturbations are non-gaussian (e.g., Robinson et al., 2000).

The value of the critical threshold $\delta_c$ was originally
inferred from spherical top-hat collapse. Expanding the
parametric solution for spherical collapse in powers of
$\theta_m$ leads to the following relation at early times:
The leading term in this expression characterizes the
behavior of a critical-density sphere and the second term
describes how the evolution of a slightly overdense sphere
deviates from that of a critical-density sphere. Assuming
that this deviation grows according to equation (53) until
the moment of collapse and virialization ($t = t_c$) gives
the value of the critical threshold in a flat universe with
$\Omega_M = 1$: $\delta_c = 3(12\pi)^{2/3}/20 \approx 1.686$.
Generalizing this treatment to cases where $\Omega_M \neq 1$
produces only minor differences in $\delta_c$ for interesting
values of the cosmological parameters (Eke et al., 1996;
Lacey and Cole, 1993).
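The value of $\delta_c$ can be recovered with a short calculation. The sketch below uses the textbook parametric solution for a bound spherical perturbation in a flat, matter-dominated universe; the symbols $A$, $B$, and $\theta$ are introduced here for illustration and need not match the notation of the original derivation.

```latex
% Parametric solution for the radius R(t) of a bound shell enclosing mass M:
R = A (1 - \cos\theta), \qquad t = B (\theta - \sin\theta), \qquad A^3 = G M B^2 .
% Expanding to next-to-leading order for small \theta:
R \simeq \frac{A \theta^2}{2} \left( 1 - \frac{\theta^2}{12} \right), \qquad
t \simeq \frac{B \theta^3}{6} \left( 1 - \frac{\theta^2}{20} \right) .
% Mean density of the sphere relative to the background \rho_b = 1 / (6 \pi G t^2):
\frac{\bar{\rho}}{\rho_b} = \frac{9 G M t^2}{2 R^3}
  \simeq 1 + \frac{3 \theta^2}{20}
  \simeq 1 + \frac{3}{20} \left( \frac{6 t}{B} \right)^{2/3} .
% Collapse occurs at \theta = 2\pi, i.e. t_c = 2 \pi B, so extrapolating the linear term gives
\delta_c = \frac{3}{20} (12 \pi)^{2/3} \approx 1.686 .
```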



The preceding derivation of the cluster mass function
is not terribly rigorous, but it is useful because adopting
$\delta_c = 1.686$ leads to mass functions that agree
reasonably well with those derived from numerical
simulations. Treating perturbation collapse as ellipsoidal
rather than spherical improves that agreement
(Sheth et al., 2001). Sheth and Tormen (1999) have shown
that the expression

$$ \frac{dn_M}{d \ln \sigma^{-1}} = A_s \sqrt{\frac{2 a_s}{\pi}} \left[ 1 + \left( \frac{\sigma^2}{a_s \delta_c^2} \right)^{p_s} \right] \frac{\Omega_M \rho_{cr0}}{M} \, \frac{\delta_c}{\sigma} \exp \left[ - \frac{a_s \delta_c^2}{2 \sigma^2} \right] \tag{54} $$

with $A_s = 0.3222$, $a_s = 0.707$, and $p_s = 0.3$ is quite
an accurate representation of the mass functions from
several different numerical simulations. However, because
semi-analytical mass functions like these are not
rigorously derived, they are essentially just fitting
formulae that conveniently express the simulation results
and should be treated cautiously outside the cosmological
models against which they have been tested.

A particularly well-tested fitting formula for cluster
mass functions has been provided by Jenkins et al. (2001).
Combining results for simulated clusters spanning a wide
range in mass and sampled at a number of different
redshifts, they found that the form of
$dn_M/d \ln \sigma^{-1}$ was nearly invariant if they defined
cluster mass to be $M_{180m}$, the mass within a sphere of
mean density $180 \, \Omega_M(z) \rho_{cr}$. When this definition of
cluster mass is used, the formula

$$ \frac{dn_M}{d \ln \sigma^{-1}} = A_J \, \frac{\Omega_M \rho_{cr0}}{M} \exp \left[ - \left| \ln \sigma^{-1} + B_J \right|^{\epsilon_J} \right] \tag{55} $$

with $A_J = 0.301$, $B_J = 0.64$, and $\epsilon_J = 3.82$,
reproduces the cluster mass function to about 20% accuracy
for all the cosmologies tested, including $\Lambda$CDM,
$\tau$CDM, and OCDM. In this expression, $A_J$ governs the
fraction of the total mass in collapsed objects, $e^{B_J}$
functions as a collapse threshold analogous to $\delta_c$, and
$\epsilon_J$ stretches the mass function to fit the
simulations. The Sheth-Tormen mass function of equation
(54) fits these same numerical simulations nearly as well.
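The three mass functions discussed in this section can be compared through their dimensionless shapes $f(\sigma) = (M / \Omega_M \rho_{cr0}) \, dn_M/d \ln \sigma^{-1}$. The sketch below assumes the standard published forms of the Press-Schechter and Sheth-Tormen functions and uses the parameter values quoted in the text; the function names are illustrative.

```python
import math

DELTA_C = 3.0 * (12.0 * math.pi) ** (2.0 / 3.0) / 20.0  # spherical-collapse threshold, ~1.686

def f_press_schechter(sigma):
    """Press-Schechter shape: sqrt(2/pi) (delta_c/sigma) exp(-delta_c^2 / 2 sigma^2)."""
    nu = DELTA_C / sigma
    return math.sqrt(2.0 / math.pi) * nu * math.exp(-0.5 * nu * nu)

def f_sheth_tormen(sigma, A_s=0.3222, a_s=0.707, p_s=0.3):
    """Sheth-Tormen shape with the parameter values quoted in the text."""
    nu2 = a_s * (DELTA_C / sigma) ** 2
    return (A_s * math.sqrt(2.0 / math.pi) * (1.0 + nu2 ** (-p_s))
            * math.sqrt(nu2) * math.exp(-0.5 * nu2))

def f_jenkins(sigma, A_J=0.301, B_J=0.64, eps_J=3.82):
    """Jenkins et al. shape: A_J exp(-|ln sigma^-1 + B_J|^eps_J)."""
    return A_J * math.exp(-abs(math.log(1.0 / sigma) + B_J) ** eps_J)

print(" sigma      PS        ST     Jenkins")
for sigma in (1.0, 0.8, 0.6, 0.5):
    print(f"{sigma:6.2f} {f_press_schechter(sigma):9.2e} "
          f"{f_sheth_tormen(sigma):9.2e} {f_jenkins(sigma):9.2e}")
```

Small values of $\sigma$ correspond to the rarest, most massive objects, where the exponential cutoffs make the three forms diverge from one another most strongly.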

The exponential sensitivity to mass and redshift
evident in these expressions for the cluster mass function
is both a blessing and a curse. On the one hand, it makes
cluster counts and their evolution with redshift a very
powerful probe of cosmological parameters. Figure 5,
showing the cluster mass function and its evolution with
time for five different cosmologies, illustrates how
sensitive mass-function evolution is to the matter density.
On the other hand, any systematic errors in the
measurement of cluster mass, including inconsistencies in
the definition of cluster mass, are also exponentially
amplified by the steepness of the mass function.

FIG. 5. Mass-function evolution in five different cosmologies. The fiducial model in all cases is the $\Lambda$CDM model with $\Omega_M = 0.3$,
$\Omega_\Lambda = 0.7$, $w = -1$, and $\sigma_8 = 0.9$. The upper left panel compares cluster evolution in the $\Lambda$CDM case with a standard
cold-dark-matter model (SCDM) having $\Omega_M = 1.0$, $\Omega_\Lambda = 0.0$, and $\sigma_8 = 0.5$. Evolution in the SCDM case is much more dramatic,
and the steeper slope of the mass function strongly disagrees with observations of local clusters (e.g., Reiprich and Böhringer,
2002). Retaining $\Omega_M = 1.0$ and $\Omega_\Lambda = 0.0$ while adjusting the power spectrum so that $\Gamma = 0.21$ gives a $\tau$CDM model (lower left)
in which the slope of the low-redshift mass function is more acceptable, but the evolution is still very strong. Dispensing with
dark energy while keeping the matter density low gives an OCDM model ($\Omega_M = 0.3$, $\Omega_\Lambda = 0$, $\sigma_8 = 0.9$; upper right) with less
evolution than the $\Lambda$CDM case because structure formation starts to ramp down earlier in time (see Figure ). Dark energy in
a $w$CDM model identical to the $\Lambda$CDM model except with $w = -0.8$ (lower right) also slows cluster evolution relative to the
$\Lambda$CDM case.



5. Cluster Bias 

Another observable feature of the cluster population,
closely related to the mass function, is the tendency of
galaxy clusters to cluster with one another. Fluctuations
in the number density of clusters on large scales are
observed to be more pronounced than the fluctuations
of the underlying matter density (e.g., Bahcall et al.,
2003a; Bahcall and Soneira, 1983; Collins et al., 2000;
Klypin and Kopylov, 1983; Postman et al., 1992). In
other words, the fractional deviation of
$dn_M/d \ln \sigma^{-1}$ from its mean value within a given
volume of the universe is observed to be larger than
$\delta\rho/\rho$ in that same volume. The ratio $b(M)$ between the
perturbation in the number density of clusters of mass $M$
and the perturbation amplitude of the matter density is
known as the bias parameter, and it is taken to be
independent of length scale, as long as that length scale
is much larger than a cluster.

Cluster bias can be interpreted as a modulation of
the collapse threshold by long-wavelength density modes
(Cole and Kaiser, 1989; Kaiser, 1984; White et al., 1987).
The idea here is that a long-wavelength density
enhancement of amplitude $\delta\rho/\rho = \epsilon$ lowers the
effective collapse threshold for smaller-scale structures to
$\delta_c - \epsilon$, thereby inducing an offset in
$dn_M/d \ln \sigma^{-1}$ from its mean value on mass scale $M$.
This contribution adds to the perturbation $\epsilon$ in cluster
number density owing to the amplitude of the large-scale
mode. Dividing the sum of these two offsets by $\epsilon$ leads
to an expression relating the bias parameter to the mass
function (Mo and White, 1996; Sheth and Tormen, 1999):

$$ b(M) = 1 - \frac{d}{d\delta_c} \ln \left( \frac{dn_M}{d \ln \sigma^{-1}} \right) \tag{56} $$

Plugging in the Sheth-Tormen mass function of equation
(54) produces

$$ b(M) = 1 + \frac{1}{\delta_c} \left( \frac{a_s \delta_c^2}{\sigma^2} - 1 \right) + \frac{2 p_s}{\delta_c \left[ 1 + (a_s \delta_c^2 / \sigma^2)^{p_s} \right]} \tag{57} $$

Hu and Kravtsov (2003) show that the parameter values
$a_s = 0.75$ and $p_s = 0.3$ accurately reproduce the bias
of cluster-sized halos seen in large-scale numerical
simulations, when cluster mass is taken to be $M_{180m}$.
Notice that small values of $\sigma(M)$ lead to large values of
$b(M)$, meaning that rare, high-mass objects are much
more likely to be found in regions of the universe where
the surrounding matter density is higher than average.
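The link between bias and the mass function can be verified numerically. The sketch below assumes the standard Sheth-Tormen form of the mass function and the peak-background-split relation $b = 1 - d \ln(dn_M/d \ln \sigma^{-1})/d\delta_c$, and compares a finite-difference derivative with the closed-form bias; the function names are illustrative.

```python
import math

DELTA_C = 1.686  # spherical-collapse threshold

def log_f_sheth_tormen(sigma, delta_c, a_s, p_s):
    """ln of the delta_c-dependent part of the Sheth-Tormen mass function."""
    nu2 = a_s * (delta_c / sigma) ** 2
    return math.log(1.0 + nu2 ** (-p_s)) + 0.5 * math.log(nu2) - 0.5 * nu2

def bias_sheth_tormen(sigma, a_s=0.75, p_s=0.3, delta_c=DELTA_C):
    """Closed-form bias obtained by differentiating the Sheth-Tormen form."""
    nu2 = a_s * (delta_c / sigma) ** 2
    return 1.0 + (nu2 - 1.0) / delta_c + 2.0 * p_s / (delta_c * (1.0 + nu2 ** p_s))

def bias_numerical(sigma, a_s=0.75, p_s=0.3, delta_c=DELTA_C, step=1.0e-5):
    """b = 1 - d ln f / d delta_c, evaluated with a central finite difference."""
    up = log_f_sheth_tormen(sigma, delta_c + step, a_s, p_s)
    down = log_f_sheth_tormen(sigma, delta_c - step, a_s, p_s)
    return 1.0 - (up - down) / (2.0 * step)

for sigma in (1.0, 0.7, 0.5):
    print(f"sigma = {sigma:.1f}: b = {bias_sheth_tormen(sigma):.3f} "
          f"(finite difference {bias_numerical(sigma):.3f})")
```

The bias rises steeply as $\sigma(M)$ falls, which is the quantitative version of the statement that the rarest clusters are the most strongly clustered.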



C. Measuring the Cluster Mass Function 

Equations (52), (54), and (55) illustrate why
cosmologists are so enthusiastic about the cluster mass
function. Dividing an accurate measurement of the mass
function by $\Omega_M \rho_{cr0}$ directly leads to an accurate
measurement of the primordial power spectrum $\sigma(M)$ on
mass scales $\sim 10^{14} - 10^{15} M_\odot$. Furthermore, any
uncertainty in $\rho_{cr0}$ scales out of the power spectrum's
normalization, because measured values of cluster number
density scale as $h^3$, making the quantity
$M_8 \rho_{cr0}^{-1} (dn_M/d \ln \sigma^{-1})$ independent of $h$. One is
left only with a degeneracy between $\sigma_8$ and $\Omega_M$.
Taking the logarithmic derivative of (52) with respect to
$\sigma$ at constant $M$ shows that the mass function is roughly
a power law in $\sigma$ in the region where $\sigma \approx 1$. Hence,
the measured level of that normalization in the local
universe reflects the parameter combination
$\sigma_8 \Omega_M^\alpha$, with $\alpha \approx 0.5$.

This degeneracy can be broken in three ways. First,
one can simply measure $\Omega_M$ or $\sigma_8$ in some other way.
Second, one can measure the cluster mass function over
a range of masses and rely on a precise measurement
of the mass function's shape to break the degeneracy,
assuming that the CDM power spectrum (Sec. III.A.5)
is valid. Or, third, one can measure the evolution of the
cluster mass function, which is highly sensitive to
$\Omega_M$. We will explore that option in more depth in
Sec. III.D, but first we need to examine some of the
obstacles to accurate mass-function measurements.



1. Linking Mass with Observables 

In order to measure the mass function using a large
sample of clusters, we need to relate cluster mass to
an easily observable quantity. Doing this properly
requires a consistent definition of mass (Sec. III.B.3) and
a well-calibrated relation linking that mass to some
observable, but which mass definition works best?
Expressions like the Jenkins mass function (equation 55)
appear to be cosmologically invariant when cluster masses
are defined with respect to the background mass density
(e.g., $M_{180m}$, see Sec. III.B.4). On the other hand, the
structure of a cluster, as reflected in its dark-matter
velocity dispersion, seems to be cosmologically invariant
when cluster masses are defined with respect to the
critical mass density (e.g., $M_{200}$, see Sec. III.B.3). To
paraphrase Evrard (2004), Nature appears to do accounting
relative to the mean mass density and dynamics relative to
the critical density.

Because simulations suggest that dynamical quantities
like the galaxy velocity dispersion and the X-ray
determined gas temperature should be more tightly
correlated with $M_{200}$ than with $M_{180m}$, we will take
$M_{200}$ to be the primary definition of cluster mass.
Methods like those outlined in Sec. III.B.3 can then be
used to convert a mass function in $M_{200}$ to one in
$M_{180m}$. Alternatively, one can fit the results of
large-scale simulations to determine a cosmology-dependent
correction to the Jenkins mass function for use with the
mass definition $M_{200}$. Evrard et al. (2002) have done
that, finding that substituting
$A_J = 0.27 - 0.07(1 - \Omega_M)$, $B_J = 0.65 + 0.11(1 - \Omega_M)$,
and $\epsilon_J = 3.8$ into equation (55) reproduces the
$M_{200}$ mass function in simulations at $z = 0$. Despite
the tight relationship between $M_{200}$ and the dark-matter
velocity dispersion in simulations, the link between
$M_{200}$ and observable quantities is still a potentially
large source of systematic error. Even if the galaxy
velocity dispersion were identical to that of the dark
matter, accurately measuring that dispersion within a
sphere of radius $r_{200}$ requires an enormous amount of
data to minimize projection effects (e.g., Rines et al., 2003).

To see how systematic errors corrupt the mass-function
measurement, consider the general case for a generic
observable $X$. Suppose a cluster survey determines the
comoving number density distribution $dn_M/d \ln X$ within
logarithmic bins of the observable. Converting this
distribution to a mass function $dn_M/d \ln \sigma^{-1}$ via the
chain rule requires, at minimum, knowledge of the
normalization and effective power-law index
$\alpha_X = d \ln X/d \ln M$ of the $M_{200}$-$X$ relation over the
observed range in $X$, as well as the effective power-law
index $\alpha_M = d \ln \sigma^{-1}/d \ln M$ of the mass fluctuations.
Fitting a semi-analytic expression for the mass function
like equation (55) to the observations for a fixed value of
$\Omega_M$ then determines $\sigma_{\rm fit} = \sigma(M_{\rm fit})$ on a
particular mass scale $M_{\rm fit}$, and consequently determines
$\sigma_8 \approx (M_{\rm fit}/M_8)^{\alpha_M} \sigma_{\rm fit}(M_{\rm fit})$.
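The chain-rule conversion and the extrapolation to $\sigma_8$ can be written out explicitly. The sketch below assumes locally constant power-law indices; all numerical inputs in the example are hypothetical and serve only to show the bookkeeping.

```python
def mass_function_from_observable(dn_dlnX, alpha_X, alpha_M):
    """Convert dn/dlnX to dn/dln(sigma^-1) with the chain rule:
    dn/dln(sigma^-1) = (dn/dlnX) * (dlnX/dlnM) / (dln(sigma^-1)/dlnM)."""
    return dn_dlnX * alpha_X / alpha_M

def sigma8_from_fit(sigma_fit, mass_fit, mass_8, alpha_M):
    """Extrapolate sigma from the fitted mass scale to M_8, assuming sigma ~ M^(-alpha_M)."""
    return (mass_fit / mass_8) ** alpha_M * sigma_fit

# Hypothetical example: temperature as the observable (X ~ M^(2/3)),
# alpha_M = 0.3, and a fit at a mass scale twice as large as M_8.
dn_dlnsigma = mass_function_from_observable(1.0e-6, 2.0 / 3.0, 0.3)
sigma_8 = sigma8_from_fit(0.75, 2.0, 1.0, 0.3)
print(f"dn/dln(sigma^-1) = {dn_dlnsigma:.2e} per unit comoving volume")
print(f"sigma_8          = {sigma_8:.3f}")
```

Because $\sigma$ declines with mass, a fit performed above $M_8$ yields a $\sigma_8$ larger than $\sigma_{\rm fit}$, and any error in $M_{\rm fit}$ propagates into $\sigma_8$ through the factor $(M_{\rm fit}/M_8)^{\alpha_M}$.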

Any systematic offset $\Delta M/M$ in the normalization of
the $M_{200}$-$X$ relation produces a corresponding offset in
the measured power-spectrum normalization:
(58)
(e.g., Evrard et al., 2002; Seljak, 2002; Voit, 2000). The

second line of this equation assumes that $\sigma_{\rm fit}$ has
been determined using the Jenkins mass function of
equation (55). On the mass scale of rich clusters
($\sim 10^{15} M_\odot$), Evrard et al. (2002) find that the
factor in parentheses is $\approx 0.4$, implying that a
systematic 25% error in mass would lead to a 10% error in
the measurement of $\sigma_8$. Below this mass scale the factor
in parentheses increases, leading to an even larger error
in the power-spectrum normalization for a given mass
offset (Huterer and White, 2002).

Dispersion in the value of the mass-tracing observable
for clusters of fixed mass is another important source
of uncertainty that must be dealt with carefully because
of the exponential slope of the mass function (e.g.,
Pierpaoli et al., 2003). Significant scatter boosts the
normalization of $dn_M/d \ln X$ over the expectation for the
no-scatter case, as the overall number of lower-mass
clusters scattering to higher values of $X$ far exceeds the
number of higher-mass clusters scattering in the opposite
direction. Underestimating this scatter leads to an
overestimate of $\sigma_8$ that can be particularly severe if
the scatter has a long non-gaussian tail to large values
of $X$. Situations in which such a tail could arise include
merger shocks that substantially boost the temperature and
luminosity in a significant subset of X-ray selected
clusters (e.g., Randall et al., 2002; Ricker and Sarazin,
2001) and superpositions of galaxies that boost the
apparent richness and velocity dispersion in an
optically-selected sample (van Haarlem et al., 1997).
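The size of the boost from scatter can be estimated with a toy model. The sketch below assumes a local power-law mass function, $dn/d \ln M \propto M^{-\gamma}$, a power-law mass-observable relation with index $\alpha_X$, and gaussian scatter in $\ln X$ at fixed mass; under those assumptions the boost is $\exp(\gamma^2 s^2 / 2 \alpha_X^2)$. The parameter values are hypothetical.

```python
import math

def boost_analytic(gamma, alpha_X, scatter):
    """Boost in dn/dlnX from gaussian scatter in ln X for dn/dlnM ~ M^-gamma."""
    return math.exp(0.5 * (gamma * scatter / alpha_X) ** 2)

def boost_numeric(gamma, alpha_X, scatter, n_steps=4001, half_width=8.0):
    """Same boost from a direct convolution, evaluated at ln X = 0.

    The integration variable u = alpha_X * ln M is the value of ln X that a
    cluster of mass M would have in the absence of scatter.
    """
    du = 2.0 * half_width * scatter / (n_steps - 1)
    norm = scatter * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n_steps):
        u = -half_width * scatter + i * du
        gaussian = math.exp(-0.5 * (u / scatter) ** 2) / norm
        total += math.exp(-gamma * u / alpha_X) * gaussian * du
    return total

gamma, alpha_X, scatter = 3.0, 1.0, 0.3  # hypothetical slope, index, and scatter in ln X
print(f"analytic boost  = {boost_analytic(gamma, alpha_X, scatter):.4f}")
print(f"numerical boost = {boost_numeric(gamma, alpha_X, scatter):.4f}")
```

The boost grows exponentially with the square of the scatter, so even a modest underestimate of the scatter biases the inferred normalization high on the steep, high-mass end of the mass function.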

Surveys that probe deep into the universe for clusters
must also cope with redshift evolution in the
mass-observable relation. That is, if
$M_{200} \propto X^{1/\alpha_X} (1 + z)^{b_X}$, then one needs to know
the value of $b_X$. This source of uncertainty affects both
the mapping of $X$ onto mass for individual clusters and the
number density one infers for clusters of a given mass
from a survey based on the observable $X$.

A sufficiently large cluster survey can circumvent many
of these systematic problems through self-calibration
(Levine et al., 2002; Majumdar and Mohr, 2003).
This procedure treats all parameters describing the
systematic uncertainties in such things as the scatter,
normalization, and evolution of the mass-observable
relation as free parameters in the overall cosmological
model. Fitting a large number of clusters spanning a wide
range in redshift to this overall model, one can then
determine not only the global cosmological parameters but
also the most likely values of the free parameters in the
mass-observable relation. However, treating the systematic
uncertainties in this way has a cost. Each free parameter
added weakens the statistical constraints on the
cosmological measurements (Majumdar and Mohr, 2003).



2. Mass-Temperature Relation

Among the observables that trace cluster mass, X-ray
temperature has received considerable recent attention
because it is closely related to the depth of a cluster's
potential well and can be readily observed to $z \sim 1$ with
current X-ray telescopes (Sec. II.B.2). Henry and Arnaud
(1991) pioneered the technique of measuring the cluster
mass function with X-ray temperatures, using cluster
temperatures determined at $z \approx 0$ with the Einstein,
Exosat, and HEAO/OSO satellites. Cluster temperatures
measured with the ASCA satellite improved the precision of
this measurement (Ikebe et al., 2002), and temperatures
measured with the Chandra and XMM-Newton telescopes should
improve that precision even more. Because the data are now
of such high quality, systematic uncertainty in the link
between mass and temperature is the main factor limiting
this technique.

Mass and temperature ought to be simply related for a
cluster in hydrostatic equilibrium. The gas temperature
of a singular isothermal sphere with mass $M_{200}$ inside
radius $r_{200}$ and mean particle mass $\mu m_p$ is

$$ k_B T_{200} = \frac{G M_{200} \mu m_p}{2 r_{200}} = \frac{\mu m_p}{2} \left[ 10 \, G M_{200} H(z) \right]^{2/3} \tag{59} $$

Realistic departures from hydrostatic equilibrium can be
assessed with simulations of structure formation that
include hydrodynamics, but they do not have a large effect
on the mass-temperature relation. These models indeed find
that gas temperature scales with $M_{200}^{2/3}$, so that
$T \propto T_{200}$, with a scatter of only 10-15% (e.g.,
Evrard et al., 1996; Frenk et al., 1999). However, to
calibrate the mass-temperature relation more precisely, we
need a more specific definition of temperature.
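The characteristic temperature of a singular isothermal sphere is straightforward to evaluate. The sketch below uses $k_B T_{200} = G M_{200} \mu m_p / 2 r_{200} = (\mu m_p / 2)[10 \, G M_{200} H(z)]^{2/3}$, which follows from hydrostatic equilibrium of isothermal gas in an isothermal potential. The mean molecular weight $\mu = 0.6$ is an assumed value for fully ionized gas, and the function name is illustrative.

```python
G_NEWTON = 4.30091e-9     # gravitational constant in Mpc (km/s)^2 / M_sun
M_P_C2_KEV = 938272.088   # proton rest energy in keV
C_KM_S = 299792.458       # speed of light in km/s

def kT200_keV(m200, hubble, mu=0.6):
    """k_B T_200 in keV for mass m200 (M_sun) and Hubble parameter H(z) (km/s/Mpc).

    mu is the mean molecular weight; 0.6 is an assumed value, not taken from the text.
    """
    sigma_squared = 0.5 * (10.0 * G_NEWTON * m200 * hubble) ** (2.0 / 3.0)  # (km/s)^2
    return mu * M_P_C2_KEV * sigma_squared / C_KM_S**2

# Example: a 1e15 h^-1 M_sun cluster at z = 0 (the factors of h cancel in M * H).
print(f"k_B T_200 = {kT200_keV(1.0e15, 100.0):.1f} keV")
```

The $M_{200}^{2/3}$ scaling means that an eightfold increase in mass at fixed redshift corresponds to a fourfold increase in temperature.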

Clusters are not perfectly isothermal, so any single
number specifying a cluster's gas temperature is some
sort of weighted mean. The luminosity-weighted mean
temperature $T_{\rm lum}$, obtained by weighting each gas
parcel's temperature by its contribution to the X-ray
luminosity, is a popular choice for comparing theory with
observations because each temperature component contributes
in proportion to its photon flux in the cluster's overall
spectrum (Sec. II.B.2). The spectral-fit temperature
$T_{\rm sp}$ has not yet received much attention in theoretical
work because it depends somewhat on the procedure used to
fit the spectrum, but Mazzotta et al. (2004) have recently
developed a temperature weighting scheme for theoretical
models that appears to track $T_{\rm sp}$ quite closely.

A bewildering variety of parameters has been used in
the literature to express the mass-temperature relation's
normalization. Here we will express the normalization
of the $M_{200}$-$T_{\rm lum}$ relation in terms of
$T_{\rm lum}/T_{200}$. This choice has two advantages: It does
not link the normalization to any particular mass or
temperature scale, and $T_{\rm lum}/T_{200} = 1$ for both a
singular isothermal sphere and an isothermal beta model
with $\beta = 2/3$.

Uncertainty in this normalization factor is currently
the single most important issue afflicting cluster
mass-function measurements with X-ray observations. Table I
provides some recent observational and theoretical
calibrations of this relation, in three different groups.
The first group gives calibrations from hydrodynamical
simulations that do not account for galaxy formation, which
generally fall into the range $T_{\rm lum}/T_{200} = 0.8 - 1.0$.
In some cases, an $M_{500}$-$T_{\rm lum}$ relation has been
converted to $M_{200}$ assuming $M_{200} = 1.4 M_{500}$,
appropriate for halo concentration $c = 5$. The second group
gives calibrations inferred from observations, which fall
into the range $T_{\rm lum}/T_{200} = 1.1 - 1.4$. In other words,
clusters of a given temperature seem to be 30% to 60% less
massive than one would expect from the simulations.
Apparently galaxy formation changes the normalization of
the $M_{200}$-$T_{\rm lum}$ relation, for reasons discussed in
detail in Sec. IV, although some of this discrepancy may
also stem from systematic offsets in the observational
interpretation. The third group of normalization factors,
which tend to lie in between the first two groups, come
from simulations that attempt to account for the effects of
galaxy formation. Given this uncertainty in the
mass-temperature normalization, the systematic uncertainty
in $\sigma_8$ values derived from cluster temperatures is about
25%, because for rich clusters $\sigma_8$ varies as a power of
$T_{\rm lum}/T_{200}$ through the dependence of the inferred
normalization on the mass calibration (Evrard et al., 2002).

TABLE I. Normalization of the Mass-Temperature Relation

Models without Radiative Cooling     T_lum/T_200 (a)
  Navarro et al. (1995)              0.99
  Evrard et al. (1996)               0.91
  Bryan and Norman (1998)            …
  Thomas et al. (2001)               …
  Muanwong et al. (2002)             …
  Muanwong et al. (2002)             …

Observations                         T_lum/T_200
  Horner et al. (1999)               …
  Horner et al. (1999)               (1.40 ± 0.16) T_6^(-0.02)
  Nevalainen et al. (2000)           (1.20 ± 0.12) T_6^(-0.20)
  Finoguenov et al. (2001b)          (1.18 ± 0.10) T_6^(-0.33)
  Finoguenov et al. (2001b)          (1.26 ± 0.11) T_6^(-0.19)
  Finoguenov et al. (2001b)          (1.33 ± 0.18) T_6^(-0.02)

Models with Radiative Cooling        T_lum/T_200
  Muanwong et al. (2002)             0.79 T_6^(…)
  Muanwong et al. (2002)             0.88 T_6^(…)
  Borgani et al. (2003)              (1.03 ± 0.03) T_6^(-0.06)
  Borgani et al. (2003)              (1.24 ± 0.03) T_6^(-0.06)

Notes:
  (a) T_6 = k_B T_lum / 6 keV
  (b) Conversion from M_500 assumes M_200 = 1.4 M_500.
  (c) T_lum computation includes gas with cooling time < 6 Gyr
  (d) T_lum computation excludes gas with cooling time < 6 Gyr
  (e) Masses estimated using isothermal beta model
  (f) Masses estimated using polytropic beta model
  (g) Conversion from M_1000 assumes M_200 = 2.0 M_1000.
  (h) Full sample, masses from isothermal beta model
  (i) Full sample, masses from polytropic beta model
  (j) Subset selected on k_B T_lum, polytropic beta model
  (k) T_lum computed with cooling cores removed
  (l) Masses inferred from polytropic beta-model fits.

Efforts are underway to reconcile the observed
normalization of the $M_{200}$-$T_{\rm lum}$ relation with
theoretical expectations. Some of the discrepancy probably
arises from systematic errors in the masses derived from
X-ray observations. Hydrostatic equilibrium is usually
assumed, but the turbulent velocities in simulated clusters
can sometimes be ~20-30% of the sound speed, in which case
the hydrostatic assumption would lead to masses
underestimated by 10-15% (Rasia et al., 2004; Ricker and
Sarazin, 2001). In addition, the beta model formalism often
used to derive cluster mass may have systematic problems.
Applying this model to simulated clusters tends to
underestimate their masses (Borgani et al., 2003;
Muanwong et al., 2002; see the last line of Table I),
and recent XMM-Newton observations suggest that the
correction for temperature gradients may be excessive
(Mushotzky, 2004; Pratt and Arnaud, 2002, 2003).

Alternative mass measurements would be very help- 
ful in solving these problems. Calibration of the mass- 
temperature relation with lensing observations has met 
with mixed success. Weak-lensing measurements of mas- 
sive, relaxed clusters agree with the masses derived from 
X-ray data under the a ssumption of hydrostatic equilib- 
rium (^Ue^^^OiEO^)- However, t hat agreement seem s 
to be poorer for less relaxed clusters ijSmith et "^■ 12001 . 
Measurements of cluster mass from the galaxy velocity 
field in and around a few very well observed clusters also 
tend to support the X-ray derived masses (Rines et al.,
2003). Because the calibration may depend systematically on
how clusters are selected, self-calibration of a large
cluster survey may ultimately be the best way of
calibrating the $M_{200}$-$T_{\rm lum}$ relation. A thorough
understanding of how galaxy formation affects that relation
would help reduce the number of free parameters that 
need to be calibrated, thereby reducing the statistical 
uncertainties achievable with self-calibration. 



3. Mass-Luminosity Relation 

X-ray luminosity also correlates well with cluster
mass and is easier to measure than X-ray temperature,
allowing for mass-function measurements using much
larger cluster samples. However, the correlation between
mass and luminosity is not as tight as that between mass
and temperature, having a scatter of ~50% (Reiprich and
Böhringer, 2002). Additionally, the normalization and slope
of the relation depend heavily on the physics of galaxy
formation (Sec. IV). Because our understanding of the
connection between galaxy formation and a cluster's X-ray
luminosity is not yet mature enough to calibrate the
mass-luminosity relation with simulations, we need to rely
solely on observational calibrations.

One common way to calibrate the mass-luminosity relation is to combine the mass-temperature relation with the observed luminosity-temperature relation (e.g., Borgani et al., 1999). On cluster scales, the relation between the total (bolometric) X-ray luminosity and $T_{\rm lum}$ is approximately a power law. Normalizing the relation at 6 keV, in the heart of the temperature range for rich clusters, leads to the expression $L_X = L_6 (T_{\rm lum}/6\,{\rm keV})^{\alpha_{LT}}$, and Table II gives some representative values of $L_6$ and $\alpha_{LT}$. Excising the central regions of clusters, out to about 100 kpc in radius, reduces the scatter in the relation because cooling and non-gravitational heating processes affect the temperature and luminosity of these regions differently from cluster to cluster (Allen and Fabian, 1998; Fabian et al., 1994; Markevitch, 1998; Voit et al., 2002).

TABLE II Luminosity-Temperature Relation at $z \sim 0$

Source                         $L_6$ (a)      $\alpha_{LT}$
Edge and Stewart (1991)        6.3 ± 1.3      2.62 ± 0.10
David et al. (1993)            5.6 ± 0.9      3.37 ± 0.05
Markevitch (1998) (b)          6.4 ± 0.6      2.64 ± 0.27
Allen and Fabian (1998) (c)    5.7 ± 3.4      2.92 ± 0.45
Allen and Fabian (1998) (d)    14.6 ± 7.3     3.08 ± 0.58
Arnaud and Evrard (1999) (e)   5.9 ± 0.4      2.88 ± 0.15
Xue and Wu (2000)              7.6 ± 1.2      2.79 ± 0.08
Novicki et al. (2002)          6.0 ± 4.2      2.82 ± 0.43
Ettori et al. (2002)           7.3 ± 1.8      2.54 ± 0.42

(a) Bolometric X-ray luminosity is $L_X = L_6 (T_{\rm lum}/6\,{\rm keV})^{\alpha_{LT}}$, with $L_6$ in units of $10^{44}\,h_{70}^{-2}\,{\rm erg\,s^{-1}}$.
(b) Cores of clusters excised to avoid cool cores.
(c) Clusters without cool cores.
(d) Clusters with cool cores.
(e) Sample avoids clusters with cool cores.
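
As a rough numerical illustration of the power-law form quoted above, the following Python sketch evaluates the bolometric luminosity for a given temperature. The function name is hypothetical, the default normalization and slope are simply the Arnaud and Evrard (1999) row of Table II, and $h_{70} = 1$ is assumed throughout.

```python
def lx_from_temperature(kT_keV, L6=5.9, alpha_LT=2.88):
    """Bolometric X-ray luminosity in erg/s, assuming h_70 = 1.

    Implements L_X = L_6 * (T_lum / 6 keV)**alpha_LT, with L_6 given in
    units of 1e44 erg/s. The defaults are one representative row of
    Table II (Arnaud and Evrard 1999); other rows can be passed in.
    """
    return L6 * 1.0e44 * (kT_keV / 6.0) ** alpha_LT


# Luminosity at a few illustrative temperatures (values chosen arbitrarily).
for kT in (3.0, 6.0, 10.0):
    print(f"kT = {kT:4.1f} keV -> L_X = {lx_from_temperature(kT):.2e} erg/s")
```

Because the relation is normalized at 6 keV, the function returns $L_6$ itself at that temperature, and the steepness of $\alpha_{LT}$ shows up directly in how quickly the luminosity drops for cooler clusters.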

The power-law index of the $L_X$-$T_{\rm lum}$ relation clearly indicates that galaxy formation has affected the $L_X$-$T_{\rm lum}$ relation. If the density distribution of intracluster gas within $r_{200}$ were self-similar, independent of cluster mass, then one would expect bremsstrahlung emission to give $L_X \propto \rho_g M_{200} T_{\rm lum}^{1/2} \propto T_{\rm lum}^2$ (Kaiser, 1986). The steepness of the observed power-law index indicates that non-gravitational processes have raised the entropy of the intracluster gas, making it harder to compress, particularly in the shallower potential wells of cool clusters. This excess entropy therefore lowers the luminosities of all clusters by lowering the mean gas density and steepens the $L_X$-$T_{\rm lum}$ relation because the impact of excess entropy decreases as cluster temperature rises (Evrard and Henry, 1991; Kaiser, 1991; Sec. IV).

Calibrating the mass-luminosity relation by coupling the mass-temperature relation with the observed luminosity-temperature relation leads, not surprisingly, to values of $\sigma_8$ that are similar to those derived from the mass-temperature relation alone and are subject to the same systematic uncertainties that plague the mass-temperature calibration. There is, however, a route to the mass-luminosity calibration that circumvents the middle step involving the mass-temperature relation.

The mass-luminosity relation can be calibrated more directly with high-quality X-ray imaging and temperature data on a complete sample of clusters. Reiprich and Böhringer (2002) have done this with ROSAT imaging data and ASCA temperatures, finding

$$L_X = 10^{45.0 \pm 0.3}\, h_{70}^{-2}\, {\rm erg\,s^{-1}} \left( \frac{M_{200}}{10^{15}\, h_{70}^{-1}\, M_\odot} \right)^{1.8} . \qquad (60)$$

Their mass calibration assumes that the cluster gas is in hydrostatic equilibrium and obeys an isothermal beta model. The masses they derive are therefore higher than those one would find after correcting for a possible negative temperature gradient at large radii but do not account for any turbulent pressure support. With this mass-luminosity relation, they find a cluster mass function whose normalization corresponds to $\sigma_8 = 0.68\,(\Omega_M/0.3)^{-0.38}$.

Furthermore, because their observed cluster sample extends over two decades in mass, Reiprich and Böhringer (2002) attempted to break the $\sigma_8$-$\Omega_M$ degeneracy by fitting the mass function's shape with a CDM power spectrum, finding a best fit of $\Omega_M = 0.12^{+0.06}_{-0.04}$ and $\sigma_8 = 0.96^{+0.15}_{-0.12}$, with $\Omega_M < 0.31$ at the $3\sigma$ level. The unusually low best-fit value of $\Omega_M$ arises because their derived mass function is shallower than that expected for $\Omega_M = 0.3$.
However, Pierpaoli et al. (2003) have applied that same $L_X$-$M_{200}$ relation to the larger REFLEX cluster sample (Böhringer et al., 2002), finding best-fit values of $\sigma_8 \approx 0.86$ and $\Omega_M \approx 0.23$. Results similar to these latter values are also found from the mass-luminosity relation when cluster evolution is used to break the $\sigma_8$-$\Omega_M$ degeneracy (Sec. III.D).
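
The degeneracy discussed above can be made concrete with a few lines of Python. This is only a sketch: the function name is hypothetical, and the defaults are the normalization $\sigma_8 = 0.68\,(\Omega_M/0.3)^{-0.38}$ quoted above for the Reiprich and Böhringer (2002) mass function.

```python
def sigma8_along_degeneracy(omega_m, sigma8_ref=0.68, exponent=-0.38,
                            omega_ref=0.3):
    """sigma_8 implied by a mass-function normalization of the form
    sigma_8 = sigma8_ref * (Omega_M / omega_ref)**exponent.

    Defaults follow the X-ray luminosity based normalization quoted in
    the text; other constraints of the same form can be passed in.
    """
    return sigma8_ref * (omega_m / omega_ref) ** exponent


# Walk along the degeneracy curve for a few illustrative densities.
for om in (0.12, 0.23, 0.30):
    print(f"Omega_M = {om:.2f} -> sigma_8 = {sigma8_along_degeneracy(om):.2f}")
```

Evaluating the curve at $\Omega_M = 0.12$ gives $\sigma_8 \approx 0.96$, which shows that the shape-based best fit lies on the same degeneracy curve as the normalization-only constraint.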



4. Mass-Richness Relation 

Optical telescopes have gathered much larger cluster 
samples than have X-ray telescopes, but deriving a mass 
function from these samples is not so straightforward. 
Projection effects complicate both the measurement of 
cluster mass and the computation of the sample volume 
associated with a given mass. Clusters in optical surveys are selected on the basis of richness, which depends on the number of galaxies observed within a certain projected radius from the center of the cluster (Sec. II.A.1). Thus, even if optical luminosity traces mass exactly, galaxy concentrations lying outside $r_{200}$ but projected along the same line of sight can boost the apparent mass, introducing non-Gaussian uncertainties in the mass-richness relation. Likewise, the effective volume associated with a given cluster mass in a richness-selected survey is harder to quantify than in a survey with a definite X-ray flux cutoff. Nevertheless, when richness is rigorously defined, it correlates well with a cluster's X-ray properties (Yee and Ellingson, 2003). However, the scatter between optical richness and X-ray luminosity is still large compared with the accuracy to which one would like to derive cosmological parameters (Donahue et al., 2001, 2002; Gilbank et al., 2003; Kochanek et al., 2003).

Measuring cluster masses purely on the basis of galaxy-count richness necessitates a different approach to defining a cluster's radius and therefore its mass. Because traditional measures of richness depend on the radius within
which galaxies are counted, they are defined with respect 
to a fixed physical radius, independent of mass, at each 
redshift. For this reason, observations of cluster richness 
are sometimes compared with simulations on the basis 
of cluster masses measured within a constant physical or 
comoving radius (Bode et al., 2001). Deriving a mass- 
richness relation from simulations of galaxy formation 
also involves an observational calibration of the cluster 
mass-to-light ratio, which according to equation (59) introduces a systematic uncertainty in $\sigma_8$ that is ~40% of the uncertainty in the mass-to-light conversion.

Making such a comparison with the early clusters from the Sloan Digital Sky Survey, Bahcall et al. (2003b) find a mass-function normalization implying $\sigma_8 = (0.69 \pm 0.07)(\Omega_M/0.3)^{-0.6}$. Adding mass-function shape information to break the degeneracy leads to $\Omega_M = 0.19^{+0.08}_{-0.07}$ and $\sigma_8 = 0.9^{+0.3}_{-0.2}$, in reasonably good agreement with the X-ray derived values. Unfortunately, because there is as yet no simple parametric form, analogous to the Jenkins mass function, giving the mass function defined with respect to a fixed radius, it is not clear how to self-calibrate the associated mass-richness relation to high accuracy with a large survey.



5. Velocity Dispersion and Mass

Velocity dispersion is the optical analog to X-ray temperature. Thus, one would expect a mass function defined on the basis of velocity dispersion to coincide with those defined with respect to cluster temperature. Measurements of rich clusters indicate that $\sigma_{1D}^2 = (1.0 \pm 0.1)\, k_B T_{\rm lum} / \mu m_p$ (e.g., Xue and Wu, 2000; Sec. II.B.2), which reassuringly suggests that both quantities accurately trace mass. On the other hand, Evrard et al. (2002) have pointed out that combining this relation with the observational calibration of the $M_{200}$-$T_{\rm lum}$ relation ($T_{\rm lum} \approx 1.2\, T_{200}$) leads to a puzzle. While it might be possible for non-gravitational effects associated with galaxy formation to boost $T_{\rm lum}$, it is more difficult to imagine why non-gravitational effects would boost the galaxy velocity dispersion by a similar factor.

Mass functions derived from velocity dispersion measurements also suggest that the masses derived from those measurements are larger than those derived from X-ray data. Using the virial theorem with a pressure correction term (Sec. II.A.2), Girardi et al. (1998) derive a cluster mass function from velocity dispersions whose normalization indicates $\sigma_8 = (1.01 \pm 0.07)(\Omega_M/0.3)^{-0.43}$, implying an overall number density at a given cluster mass about two times larger than the X-ray measurements. A discrepancy in $\sigma_8$ as large as 30% could arise if the optically determined masses were over 50% larger, but the actual mass discrepancies appear not to be quite so large. Reiprich and Böhringer (2002) find that in the 42 clusters they have in common with Girardi et al. (1998) the virial masses are 25% larger, on average, than the X-ray masses. Another factor that could contribute to this discrepancy is scatter in the $M_{200}$-$\sigma_{1D}$ relation. An underestimate of the scatter would drive up the inferred mass-function amplitude, raising the best-fitting value of $\sigma_8$.

Part of the discrepancy between the optical and X-ray determined masses may stem from how velocity dispersions are observed. Because $\sigma_{1D}$ declines with projected radius, its observed value will depend on the cutoff radius. Also, any foreground or background interlopers projected onto the cluster can contaminate the velocity-dispersion measurement. Ideally, one would like to cut off the measurement at a spherical boundary with radius $r_{200}$, inside of which the relation between dark-matter velocity dispersion and mass is well calibrated, but the large number of galaxy velocities needed to accurately measure the mass profile near the virial radius makes this approach impractical for large cluster samples. In a small sample of eight rich clusters, Rines et al. (2003) have used an average of almost 200 galaxy velocities per cluster, extending to well beyond $r_{200}$, to measure the mass $M_{200}$ within $r_{200}$. The masses they find are consistent with both the X-ray determined masses and with the virial theorem including a surface-pressure correction.



6. Weak Lensing and Mass

Weak lensing is a very promising method for measuring cluster masses that is independent of a cluster's baryon content, dynamical state, and mass-to-light ratio. The main systematic problem in weak-lensing mass measurements comes from the lensing done by excess mass outside the virial radius but along the line of sight through the cluster (Hoekstra, 2001; Metzler et al., 2001, 1999). So far, weak lensing's main contribution to cluster studies has been to assist in the calibration of other mass estimators (e.g., Allen et al., 2001).

Techniques for compiling cluster samples selected on the basis of weak lensing are still in their infancy. Only a few clusters with confirmed spectroscopic redshifts have been detected in weak-lensing surveys (e.g., Dahle et al., 2003; Schirmer et al., 2003; Wittman et al., 2003, 2001). However, deep optical surveys covering wide patches of the sky should turn up many more such clusters in the coming decade. In the meantime, smaller weak-lensing surveys sensitive to large-scale structure are complementing the cluster work because they provide values of $\sigma_8$ that are independent of the cluster measurements. Numbers currently in the literature span approximately the same range as those derived from clusters,
going from $\sigma_8 = (0.72 \pm 0.08)(\Omega_M/0.3)^{-0.57}$ (Jarvis et al., 2003) on the low end to $\sigma_8 = (0.97 \pm 0.14)(\Omega_M/0.3)^{-0.68}$ (Bacon et al., 2003) on the high end.



7. Baryons and Mass

Yet another technique for measuring the cluster mass function relies on the constancy of the ratio of baryons to dark matter in massive clusters. X-ray observations from Chandra indicate that the ratio of hot baryonic gas to total gravitating matter within a given radius asymptotically approaches $(0.113 \pm 0.005)\, h_{70}^{-3/2}$ in relaxed, high-mass clusters (Allen et al., 2002). Correcting for the baryons in stars, whose mass is approximately $0.16\, h_{70}^{1/2}$ times that of the hot gas (Sec. IV.D), raises the overall ratio of baryons to dark matter in clusters to $f_b = 0.13$ for $h_{70} = 1.0$.

This ratio is itself one of the best tools for measuring $\Omega_M$ (Allen et al., 2002; David et al., 1995; Evrard, 1997; White et al., 1993). No known hydrodynamic process can drive a large proportion of a rich cluster's baryons out of the cluster's deep potential well. Thus, the ratio of baryons to dark matter in a cluster is expected to be similar to the global ratio in the universe. Dividing the mean baryon density $\Omega_b \approx 0.045\, h_{70}^{-2}$, consistent with both the abundances of light elements (Burles et al., 2001) and microwave background fluctuations (Spergel et al., 2003), by the baryon fraction implies $\Omega_M \approx 0.3$. Allen et al. (2002) find $\Omega_M = 0.30^{+0.04}_{-0.03}$ after marginalizing over the uncertainties in $\Omega_b$ and Hubble's constant.
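
The arithmetic behind this estimate is short enough to sketch in Python. The function names are hypothetical, the central values and $h_{70}$ scalings are those quoted above with uncertainties ignored, and $f_b$ is treated here as the baryon fraction of the total cluster mass.

```python
def gas_fraction(h70=1.0):
    """Hot-gas mass over total mass in relaxed, massive clusters."""
    return 0.113 * h70 ** -1.5


def stellar_to_gas_ratio(h70=1.0):
    """Stellar mass relative to hot-gas mass."""
    return 0.16 * h70 ** 0.5


def baryon_fraction(h70=1.0):
    """f_b = (gas + stars) / total mass."""
    return gas_fraction(h70) * (1.0 + stellar_to_gas_ratio(h70))


def omega_matter(h70=1.0, omega_b_h70sq=0.045):
    """Omega_M = Omega_b / f_b, assuming clusters are fair baryon samples."""
    return omega_b_h70sq * h70 ** -2 / baryon_fraction(h70)


print(f"f_b     = {baryon_fraction():.3f}")
print(f"Omega_M = {omega_matter():.2f}")
```

With central values only, the sketch gives $f_b \approx 0.13$ and a matter density somewhat above 0.3; the full analysis cited above marginalizes over $\Omega_b$ and the Hubble constant rather than fixing them.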

One can also use the ratio of baryons to dark matter to constrain dark energy (Pen, 1997; Sasaki, 1996). Measurements of this ratio in clusters depend on the relationship between transverse size and redshift, which depends on both $\Omega_M$ and $\Omega_\Lambda$ (Sec. III.A.2). If the actual ratio remains constant with redshift, then the measured ratios will be independent of redshift only if the correct values of $\Omega_M$ and $\Omega_\Lambda$ are used in the measurement. Allen et al. (2004) have recently shown that the measured baryon to dark matter ratio in a sample of 26 clusters ranging up to $z \approx 0.9$ is consistent with the low-redshift ratio for $\Omega_\Lambda = 0.94^{+0.21}_{-0.23}$ (see also Allen et al., 2002; Ettori et al., 2003). However, the degree to which the actual ratio is redshift-independent is not yet known.

If the ratio of baryons to dark matter were completely independent of cluster mass and radius, then measurements of the baryon mass inside a radius containing a mean baryon density of $200 f_b \rho_{\rm cr}$ would directly give $M_{200}$. The cluster mass function could then be determined by measuring the baryon masses within a given scale radius (Vikhlinin et al., 2003; Voevodkin and Vikhlinin, 2004). In fact, the baryon fraction is not quite constant in clusters, probably owing to the same galaxy-formation effects that shift the $M$-$T$ and $L$-$T$ relations (Sec. IV). For example, Mohr et al. (1999) find that the ratio of gas mass to dark matter increases with $T_{\rm lum}$ in clusters cooler than about 6 keV and is statistically inconsistent with a constant value at the 99% level. Other studies concur that the proportion of hot gas in low-mass clusters is smaller than that in high-mass clusters (Neumann and Arnaud, 2001; Sanderson et al., 2003).



After correcting for this effect, Voevodkin and Vikhlinin (2004) infer a cluster mass function from the baryon mass function signifying $\sigma_8 = 0.72 \pm 0.04$ for the assumed cosmology ($\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$, and $h = 0.71$). Furthermore, the shape parameter $\Gamma = 0.13 \pm 0.07$ of the mass function is consistent with the CDM power spectrum given the assumed values of $\Omega_M$ and $h$. Notice that this value of $\sigma_8$ agrees with those derived from the observationally calibrated $M_{200}$-$T_{\rm lum}$ and $M_{200}$-$L_X$ relations, even though it does not explicitly rely on those calibrations.



D. Evolution of the Mass Function 

Measurements of evolution in the cluster mass function 
can considerably tighten all these constraints on cosmo- 
logical parameters. What we actually observe, of course, 
is the dependence on redshift of the observables that trace 
the cluster mass function. For a given cluster sample we 
can measure the number of clusters $dN$ within a given solid angle $d\Omega$ and redshift interval $[z, z + dz]$ that fall into the range $[X, X + dX]$ of the observable $X$. With full knowledge of the mass-observable relation $M(X, z)$ and its scatter as a function of redshift, we could then derive the redshift distribution

$$\frac{d^3N}{dM\, d\Omega\, dz}(M, z) = \frac{dn_M}{dM}(M, z)\, \frac{d^2V_{\rm co}}{dz\, d\Omega}(z) \qquad (61)$$

for clusters of mass $M$ directly from the observations. This distribution of clusters with redshift would then provide strong constraints on cosmological models through both the mass-function evolution factor $dn_M/dM$ and the comoving volume factor $d^2V_{\rm co}/d\Omega\, dz$.


As the reader probably suspects by now, our ability to 
constrain cosmological parameters through the redshift 
distribution of clusters is currently limited by our under- 
standing of evolution in the mass-observable relations. 
However, this problem is not as severe as one might expect because the evolution in the mass function itself is so dramatic, especially for $\Omega_M \approx 1$. This part of the review discusses what we have learned about structure formation and cosmological parameters by observing cluster evolution. It begins with a description of how mass-function evolution depends on cosmological parameters and then considers the complications arising from evolution of the observables themselves. It concludes with a summary of current constraints on $\Omega_M$ from cluster evolution and a look at the prospects for measuring $\Omega_\Lambda$ and $w$ with large cluster surveys.



1. Dependence on Cosmology 

Evolution of the mass function is highly sensitive to cosmology because the matter density controls the rate at which structure grows. When the mass function can be expressed in terms of formulae like (52), (54), or (55), its evolution is controlled entirely by the growth function $D(z)$, which is a well-defined function of $\Omega_M$, $\Omega_\Lambda$, and $w$ (Sec. III.B.4). Small-amplitude density perturbations grow as $D(z) \propto (1 + z)^{-1}$ when $\Omega_M(z) \approx 1$, but perturbation growth stalls when $\Omega_M(z) \ll 1$. This effect manifests itself most strongly in high-mass clusters because they are the latest objects to form in a hierarchical cosmology with a CDM-like power spectrum (Eke et al., 1996; Evrard, 1989; Oukbir and Blanchard, 1992; Peebles et al., 1989; Viana and Liddle, 1996). The exponential dependence of the mass function on $\sigma(M, z) = D(z)\,\sigma(M, 0)$ makes the effect quite dramatic for objects sufficiently massive that $\sigma(M, 0) < 1$.
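
The stalling of perturbation growth can be checked numerically. The Python sketch below is restricted, as an assumption, to models containing only matter, curvature, and a cosmological constant ($w = -1$), for which the growing mode obeys $D(a) \propto E(a) \int_0^a da'/[a' E(a')]^3$ with $a = 1/(1+z)$ and $E = H/H_0$. All function names are hypothetical.

```python
import math


def e_of_a(a, omega_m, omega_l):
    """Dimensionless expansion rate H(a)/H_0 for matter + curvature + Lambda."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m / a**3 + omega_k / a**2 + omega_l)


def omega_m_of_z(z, omega_m, omega_l):
    """Matter density parameter at redshift z."""
    a = 1.0 / (1.0 + z)
    return omega_m / a**3 / e_of_a(a, omega_m, omega_l) ** 2


def growth_unnormalized(z, omega_m, omega_l, steps=4000):
    """Growing mode up to a constant factor, via a midpoint-rule integral."""
    a_end = 1.0 / (1.0 + z)
    da = a_end / steps
    integral = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        integral += da / (a * e_of_a(a, omega_m, omega_l)) ** 3
    return e_of_a(a_end, omega_m, omega_l) * integral


def growth_ratio(z, omega_m, omega_l):
    """D(z) / D(0)."""
    return (growth_unnormalized(z, omega_m, omega_l)
            / growth_unnormalized(0.0, omega_m, omega_l))


for label, om, ol in (("Omega_M = 1", 1.0, 0.0), ("LambdaCDM", 0.3, 0.7)):
    print(f"{label:12s} D(z=1)/D(0) = {growth_ratio(1.0, om, ol):.3f}")
```

For $\Omega_M = 1$ the ratio is exactly $(1+z)^{-1}$, whereas for $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$ the amplitude at $z = 1$ is a larger fraction of its present value, which is the slower recent growth described above.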

Dependence of the mass function on $\Omega_\Lambda$ and $w$ is a little more subtle. These parameters affect mass-function evolution by altering the redshift at which $\Omega_M(z)$ departs significantly from unity for a given value of $\Omega_M$ at $z = 0$ (Haiman et al., 2001). The time at which dark
energy begins to dominate the dynamics of the universe 
is later for both larger values of and smaller (more 
negative) values of w (see Figure leading to greater 
evolution of the mass function between $z \sim 1$ and the present (Battye and Weller, 2003; Wang and Steinhardt, 1998).

Measurements of how the mass function changes with redshift can provide additional information about $\Omega_\Lambda$ and $w$ through the expansion rate of the universe. If the mass function of clusters is precisely known, then number counts of clusters exceeding a given mass in each redshift interval $dz$ reveal the volume associated with that redshift interval and can be used to determine the dynamics of the universe's expansion. The number of clusters with mass $> M$ on the celestial sphere in the redshift interval $dz$ is given by

$$\frac{dN}{dz}(M) = 4\pi\, \frac{d^2V_{\rm co}}{dz\, d\Omega}(z)\; n_M(M, z) , \qquad (62)$$

in which the comoving volume element $d^2V_{\rm co}/dz\, d\Omega$ is inversely proportional to the expansion rate $H(z)$.


Figure 6 shows this number-redshift distribution for several different cosmological models. Notice that the statistical power of cluster surveys is ultimately limited by the total number of massive clusters in the observable universe, which is of order $10^5$.

FIG. 6 Predicted number of clusters on the sky as a function of redshift in different cosmologies. The upper panel shows the number of clusters per unit redshift with $M_{200} > 3 \times 10^{14}\, h_{70}^{-1}\, M_\odot$ over the entire sky. Notice that there are a few tens of thousands of such clusters on the sky in models with $\Omega_M = 0.3$, most of them at $z < 1$. There are many fewer massive clusters at $z > 0.5$ in the $\tau$CDM model with $\Omega_M = 1$ because cluster evolution is so rapid in that case. The lower panel shows the numbers of clusters with $M_{200} > 1 \times 10^{15}\, h_{70}^{-1}\, M_\odot$. Differences between models with $\Omega_M \approx 0.3$ but differing values of $\Omega_\Lambda$ and $w$ should be detectable in large cluster surveys containing $\sim 10^4$ clusters and extending to $z \sim 1$.
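
The volume factor entering the number-redshift distribution of equation (62) can be sketched in Python under simplifying assumptions: a spatially flat universe, dark energy with constant $w$, and a cumulative number density supplied by hand (the value below is a placeholder, not a measurement). For a flat universe the all-sky comoving volume per unit redshift is $4\pi c\, r^2(z)/H(z)$, with $r(z)$ the comoving distance. All function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble(z, h0=70.0, omega_m=0.3, w=-1.0):
    """H(z) in km/s/Mpc for a flat universe with constant dark-energy w."""
    omega_de = 1.0 - omega_m
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3
                          + omega_de * (1.0 + z) ** (3.0 * (1.0 + w)))


def comoving_distance(z, h0=70.0, omega_m=0.3, w=-1.0, steps=2000):
    """Comoving distance in Mpc from a midpoint-rule integral of c/H."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        total += C_KM_S / hubble((i + 0.5) * dz, h0, omega_m, w)
    return total * dz


def clusters_per_unit_redshift(z, n_above_m, h0=70.0, omega_m=0.3, w=-1.0):
    """All-sky dN/dz given a cumulative number density in Mpc^-3."""
    r = comoving_distance(z, h0, omega_m, w)
    volume_per_dz = 4.0 * math.pi * C_KM_S * r ** 2 / hubble(z, h0, omega_m, w)
    return volume_per_dz * n_above_m


n_placeholder = 1.0e-7  # hypothetical number density, Mpc^-3
for om in (0.3, 1.0):
    print(f"Omega_M = {om}: dN/dz at z = 0.5 is "
          f"{clusters_per_unit_redshift(0.5, n_placeholder, omega_m=om):.0f}")
```

At fixed number density the accelerating model encloses more volume per unit redshift than the $\Omega_M = 1$ model, so differences in the counts reflect both the volume factor and the evolution of the mass function itself.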



2. Evolution of the Observables 

All of the mass-observable relations discussed in Sec. III.C evolve with redshift, partly because the definition of $M_{200}$ is pinned to the critical density and partly because of galaxy-formation physics. Clusters of a given mass are hotter earlier in time because their matter density is larger; both $T_{200}$ and the square of the dark-matter velocity dispersion for a fixed value of $M_{200}$ vary with redshift as $H^{2/3}(z)$ (Sec. III.C.2). One therefore expects $T_{\rm lum}$ and the square of the galaxy velocity dispersion to depend on redshift in the same way, but it is possible that the physics of galaxy formation adds additional redshift evolution that must be accounted for in precise cosmological measurements. Galaxy formation plays a more explicit role in the mass-richness and $M_{200}$-$L_X$ relations, because the optical luminosities of galaxies evolve with time and because the physics of galaxy formation alters the $L_X$-$T_X$ relation (Sec. IV.C). Scatter in the mass-observable relation might also be larger at higher redshifts, given that the proportion of relaxed clusters may well be smaller earlier in time.

As an example of how mass-observable evolution affects observations of mass-function evolution, consider its effects on X-ray surveys. The upper left of Figure 7 shows how the cluster mass function evolves for two different cosmologies, a standard $\Lambda$CDM model ($\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$, $w = -1$, $\sigma_8 = 0.9$) and a $\tau$CDM model ($\Omega_M = 1.0$, $\Omega_\Lambda = 0.0$, $\sigma_8 = 0.5$, $\Gamma = 0.21$) whose power spectrum has been adjusted by hand so that its shape is similar to that of the $\Lambda$CDM model, as required by observations of large-scale structure (Sec. III). Mass-function evolution is quite pronounced in both models but is far stronger in the $\tau$CDM model.

Evolution in the mass-temperature relation weakens the observed amount of cluster evolution when cluster number density is plotted as a function of temperature. The upper right of Figure 7 shows the result of using an $M_{200}$-$T_{\rm lum}$ relation with $T_{\rm lum}/T_{200} = 1$ and zero dispersion. Because clusters of a given mass are hotter at earlier times, the higher redshift curves have translated to higher temperatures, compared with their positions in the upper-left panel. Additional mass-temperature evolution exceeding that predicted by the virial theorem, corresponding to values of $T_{\rm lum}/T_{200}$ that increase with redshift, would further reduce the evolution, but there is currently no evidence for such evolution.
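
The size of this temperature shift follows from the $H^{2/3}(z)$ scaling of $T_{200}$ at fixed $M_{200}$. A minimal Python sketch, assuming a universe with only matter, curvature, and a cosmological constant and using hypothetical function names:

```python
import math


def e_of_z(z, omega_m=0.3, omega_l=0.7):
    """H(z)/H_0 for matter + curvature + cosmological constant."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * (1.0 + z) ** 3
                     + omega_k * (1.0 + z) ** 2 + omega_l)


def temperature_boost(z, omega_m=0.3, omega_l=0.7):
    """T_200(z) / T_200(0) at fixed M_200, from T_200 ~ H(z)**(2/3)."""
    return e_of_z(z, omega_m, omega_l) ** (2.0 / 3.0)


for z in (0.0, 0.5, 1.0):
    print(f"z = {z:.1f}: LambdaCDM boost = {temperature_boost(z):.2f}, "
          f"Omega_M = 1 boost = {temperature_boost(z, 1.0, 0.0):.2f}")
```

In the $\Omega_M = 1$ case the boost is simply $1 + z$, so a cluster of fixed mass is twice as hot at $z = 1$; in the $\Lambda$CDM case the shift is smaller, a little under 50% at $z = 1$.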

Redshift-dependent changes in the luminosity-temperature relation can have additional evolution-softening effects. The $L_X \propto T_{\rm lum}^3$ power-law form of the relation appears to remain the same with redshift, but the amount of evolution in the normalization is uncertain. Early assessments suggested no evolution in the normalization (Borgani et al., 1999; Donahue et al., 1999; Mushotzky and Scharf, 1997). The lower left of Figure 7 shows the evolution of cluster number density plotted against luminosity for a non-evolving normalization and $L_X = 6 \times 10^{44}\, h_{70}^{-2}\, {\rm erg\,s^{-1}}\, (T_{\rm lum}/6\,{\rm keV})^3$, again with no dispersion. These curves differ from the temperature-function curves only in the labeling of the horizontal axis. More recent results indicate that higher-redshift clusters of a given temperature are more luminous, with an evolving relation $L_X \propto T_{\rm lum}^3 (1 + z)^{B_{LT}}$, where $0.5 < B_{LT} < 1.5$ (Ettori et al., 2003; Lumb et al., 2003; Vikhlinin et al., 2002; see Sec. IV.C.4). The lower right of Figure 7 shows the same distribution functions for $B_{LT} = 1.5$, at the high end of the suggested range. The extra redshift dependence in this case slides the high-redshift curves even further to the right, roughly compensating for all of the evolution in the underlying mass function.
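
The effect of an evolving normalization can be illustrated with a short Python sketch. The function name is hypothetical; the normalization of $6 \times 10^{44}$ erg/s at 6 keV (for $h_{70} = 1$) and the power-law slope of 3 are taken here as assumptions matching the illustrative relation used for Figure 7.

```python
def lx_evolving(kT_keV, z, b_lt=0.0, l6=6.0e44, slope=3.0):
    """Bolometric L_X in erg/s for L_X = l6 * (T/6 keV)**slope * (1+z)**b_lt.

    b_lt = 0 corresponds to a non-evolving normalization; b_lt = 1.5 is
    the high end of the range discussed in the text.
    """
    return l6 * (kT_keV / 6.0) ** slope * (1.0 + z) ** b_lt


for b in (0.0, 0.5, 1.5):
    ratio = lx_evolving(6.0, 1.0, b) / lx_evolving(6.0, 0.0, b)
    print(f"B_LT = {b:.1f}: L_X(z=1) / L_X(z=0) at fixed T = {ratio:.2f}")
```

For $B_{LT} = 1.5$ a cluster of fixed temperature is nearly three times more luminous at $z = 1$ than today, which is why the high-redshift curves slide so far toward higher luminosity.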

These examples underscore the importance of constraining evolution in the mass-observable relations, even if the observables could be perfectly measured. In addition, one must bear in mind that the observations themselves can introduce spurious redshift dependences in the mass-observable relations, largely because distant clusters are more difficult to observe than nearby ones. Optical projection effects become progressively harder to deal with at high redshift, complicating observations of richness and velocity dispersion; observations of weak lensing have fewer background galaxies to measure; and the decline in X-ray surface brightness makes cluster temperature measurements more difficult. In many ways, the Sunyaev-Zeldovich effect is the most promising observable for characterizing high-redshift clusters because its magnitude does not depend on redshift (Sec. II.C).

There are three basic ways to deal with evolution in 
the normalization and perhaps the scatter of a mass- 
observable relation: 

• Assume a model for the evolution of the relation. 
Numerical simulations can be very helpful in pro- 
viding a model for evolution of the normalization 
and scatter of mass-observable relations but give 
accurate results only if they include all the relevant 
physics. 

• Assume a parametric form for the mass-observable 
relation inspired by theoretical models and try to 
calibrate it directly with observations. The
normalization of the relation is usually assumed to be
proportional to either $(1 + z)$ or $H(z)$ raised to a power
determined by a fit to observations. In practice,
however, the mass-observable relations for distant 
clusters are not directly calibrated. What we have 
instead are relations that link one easily observed 
quantity, such as X-ray luminosity, to another that 
is more closely related to mass, like X-ray temper- 
ature or the weak-lensing distortion. 

• Assume a parametric form for the mass-observable 
relation and apply self-calibration techniques to a 
large cluster survey to find the most likely parameters
describing mass-observable evolution (Hu, 2003;
Levine et al., 2002; Majumdar and Mohr, 2003, 2004).
Parameters involving redshift-dependent scatter in the
relation can also be included in such an analysis.
This technique is very
promising but requires large surveys of distant clus- 
ters which are not yet in hand. Its accuracy is lim- 
ited by the number of free parameters needed to 
describe the mass-observable relations — the fewer, 
the better. Having a realistic physical model for 
mass-observable evolution helps boost the accu- 
racy achievable with self-calibration by reducing 
the number of unknown parameters. 

A decade from now, when much larger cluster samples 
will be available, self-calibration will probably be the best 
way to calibrate the mass-observable relations. In the 
meantime, it would be wise to spend some effort on direct 
observational calibrations through cross-comparisons of 
multiple mass-tracing observables. 



3. Constraints on Dark Matter 

Surveys of distant clusters find modest evolution in their comoving number density fully consistent with cosmological models in which $\Omega_M \approx 0.3$. Because the rate of mass-function evolution at moderate redshifts ($z \sim 0.5$) is governed primarily by the overall matter density, this conclusion does not depend strongly on the value of $\Omega_\Lambda$. Here we focus on the constraints on $\Omega_M$ derived from X-ray surveys, whose observables ($L_X$, $T_X$, and baryonic mass) are related to the spherical mass $M_{200}$ through simple parametric relations.

FIG. 7 Evolution of the cluster mass function and its manifestations in temperature and luminosity. The two models in this comparison are $\Lambda$CDM ($\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$, $w = -1$, $\sigma_8 = 0.9$) and $\tau$CDM ($\Omega_M = 1.0$, $\Gamma = 0.21$, $\sigma_8 = 0.5$). Evolution of the mass function, shown at the upper left, is far more pronounced in $\tau$CDM (dashed lines) than in $\Lambda$CDM (solid lines) because it is so sensitive to the current matter density. Each set of three lines shows the differential mass function $dn_M/d\ln M$ at $z = 0$, 0.5, and 1.0, from top to bottom, and black squares show the value of the mass function at a fiducial mass of $10^{15}\, h_{70}^{-1}\, M_\odot$. The upper right panel shows the same mass functions plotted against temperature, assuming $T = T_{200}(M_{200}, z)$. Notice that the higher-redshift curves have shifted to the right, weakening the evolution in temperature space, because clusters of a given mass have higher temperatures at higher redshifts. In order to convert these curves to temperature functions, one would need to convolve them with the scatter in the mass-temperature relation and multiply by $d\ln M/d\ln T \approx 1.5$. The lower two panels show these same curves as a function of luminosity, assuming $L_X = (6 \times 10^{44}\, h_{70}^{-2}\, {\rm erg\,s^{-1}})(T_{200}/6\,{\rm keV})^3$ at $z = 0$ and two different redshift dependences of the $L_X$-$T$ relation. In the case without $L_X$-$T$ evolution on the left-hand side, the curves are just relabeled versions of the ones in the upper-right panel. However, the strong $L_X$-$T$ evolution in the right-hand panel ($L_X \propto T^3 (1 + z)^{1.5}$) shifts the three curves in the $\Lambda$CDM case nearly on top of one another at $L_X \sim 10^{44}\, h_{70}^{-2}\, {\rm erg\,s^{-1}}$. Convolving these curves with the dispersion in the mass-luminosity relation and multiplying by $d\ln M/d\ln L \approx 0.5$ converts them to luminosity functions.

FIG. 8 Observed evolution in the integrated cluster temperature function $n(>kT)$ giving the comoving number density of clusters with temperatures exceeding $kT$. Open circles give the low-redshift temperature function from a sample of clusters with mean redshift $z = 0.051$. Filled circles give the observed temperature function of clusters with a mean redshift of $z = 0.429$. The data points in each case are correlated because $n(>kT)$ at a given temperature is a cumulative function depending on all data points at higher temperatures. Lines give the predicted temperature functions at $z = 0.051$ (dotted line) and $z = 0.429$ (solid line) for the best-fitting model: $\Omega_M = 0.28$, $\Omega_\Lambda = 0.98$, $\sigma_8 = 0.68$. (Figure courtesy of Pat Henry.)

Evolution in the X-ray temperature function was first observed by Henry (1997), who showed that the comoving number density of ~5 keV clusters at $z \sim 0.35$ was only slightly smaller than it is today. Assuming standard evolution of the mass-temperature relation, Eke et al. (1998) derived matter-density constraints $\Omega_M = 0.38 \pm 0.2$ for $\Omega_\Lambda = 1 - \Omega_M$ and $\Omega_M = 0.44 \pm 0.2$ for $\Omega_\Lambda = 0$ from these data using a maximum likelihood analysis to take full advantage of the sparse temperature data. More conservative analyses that simply counted clusters hotter than a given temperature found weaker constraints (Viana and Liddle, 1999). Henry (2000) provides a complete discussion of the cluster temperature data and the maximum-likelihood analysis technique.

Temperature measurements of a handful of hot clusters at higher redshifts have shown that the rate of cluster temperature evolution remains modest at higher redshifts (Donahue, 1996; Donahue et al., 1998, 1999; Henry, 2000). The comoving number density of ~8 keV clusters at $z \sim 0.5$-$0.8$ is no less than about one tenth of its current value, in strong disagreement with the standard expectation in an $\Omega_M = 1$ universe (see Figure 8). Including these hot, distant clusters in the analysis further strengthens the constraints on the matter density, ruling out $\Omega_M = 1$ at the $3\sigma$ level for standard mass-temperature evolution (Bahcall and Fan, 1998; Donahue and Voit, 1999; Donahue et al., 1998; Evrard et al., 2002). In order for such hot clusters to exist in a flat, matter-dominated universe, the mass-temperature relation would have to evolve in a non-standard way, with an increase in either the scatter or the normalization at $z > 0.5$. Evrard et al. (2002) have shown that the $\tau$CDM mass function of Figure 7 is consistent with the temperature function observations only if the mass-temperature normalization factor $T_{\rm lum}/T_{200}$ is ~1.5 times higher at $z \sim 0.5$ than at present. Such a big change seems unlikely in light of alternative observations of these hot, high-redshift clusters that agree with the large masses inferred from the standard normalization (Donahue et al., 1998; Luppino and Gioia, 1995; Tran et al., 1999).

Observations of evolution in the X-ray luminosity function
have greater statistical power because many more clusters
have known luminosities than have known temperatures, but
uncertainties in luminosity-temperature evolution dilute
the constraints they place on Ω_M. Many X-ray surveys have
shown that the comoving number density of clusters at a
given luminosity changes very little from redshift z ~ 0.8
to the present for L_X ≲ 10^44 erg s^-1; significant
evolution is seen only for clusters with L_X > 10^44
erg s^-1 (Mullis et al., 2004; Rosati et al., 2002).
Evolution this mild is generally expected in cosmological
models with Ω_M ≈ 0.3. Strong evolution in the
luminosity-temperature relation must occur in models with
Ω_M = 1 for the observed evolution in the luminosity
function to be so weak (see Figure 9). An extensive
analysis by Borgani et al. (2001) of luminosity-function
evolution in the ROSAT Deep Cluster Survey, which extends
to z ~ 1, indicates that Ω_M = 0.35 (+0.13, −0.10), where
the error bars signify the 1σ confidence interval. Models
with Ω_M = 1 fall outside the 3σ confidence interval, even
when the normalization of the luminosity-temperature
relation is allowed to vary with redshift as (1 + z) at
fixed T_lum.

The evolution of the baryon mass function observed with
X-ray telescopes agrees with the conclusions drawn from the
luminosity and temperature functions. Vikhlinin et al.
(2003) have measured the baryon mass function in a sample
of clusters at z ~ 0.5, finding that the comoving number
density of massive clusters at that redshift is roughly one
tenth of the current value. This result implies
Ω_M = 0.25 ± 0.1 (1σ confidence interval) for
Ω_Λ = 1 − Ω_M.

Optical studies concur that cluster evolution has been
relatively modest since z ~ 0.5, buttressing the conclusion
that Ω_M < 1. In fact, the evolution of optically selected
clusters appears even milder than the evolution in X-ray
selected clusters (e.g., Postman et al., 2002), which would
imply an even smaller value of Ω_M. However, it is not yet
clear how much of this discrepancy arises from differences
between the projected masses measured by optical surveys
and the spherical masses measured by X-ray surveys.

FIG. 9 Observed evolution in the cluster luminosity
function. Many different cluster surveys spanning the range
z ≲ 1 are shown on this figure. The vertical axis gives the
luminosity functions derived from these surveys in terms of
φ = dn/dL_X, and the shaded region shows the luminosity
function at z ~ 0. Significant evolution is seen only at
L_X ≳ 10^44 erg s^-1, consistent with ΛCDM models with a
moderate amount of evolution in the L_X-T relation (see
Figure 7). (Figure from Mullis et al., 2004.)



4. Constraints on Dark Energy

Existing cluster surveys, taken by themselves, do not yet
place strong constraints on dark energy, but that situation
is likely to change in the coming decade, with the advent
of large, deep cluster surveys in the optical, X-ray, and
microwave bands. Currently, the most interesting
information that clusters provide about dark energy comes
from combining the results of cluster surveys with other
information. If the overall geometry of the universe is
indeed flat, as seems quite evident from the temperature
patterns in the cosmic microwave background (e.g., Spergel
et al., 2003), then the matter density inferred from
clusters implies Ω_Λ = 1 − Ω_M = 0.7 ± 0.1, in agreement
with measurements of Ω_Λ from the supernova
magnitude-redshift relation (Perlmutter et al., 1999; Riess
et al., 1998). Geometrical arguments involving clusters
provide weaker support for this conclusion. If the baryon
fraction of clusters at a given temperature is assumed to
remain constant with time, then the transverse sizes of
clusters as a function of redshift can be used to constrain
the geometry of the universe. Studies using such methods
disfavor Ω_Λ = 0 (Arnaud et al., 2002; Mohr et al., 2000).

Large cluster surveys extending to z ~ 1 have the potential
to place much stronger constraints on the dark-energy
parameters Ω_Λ and w, independent of other information, as
long as these surveys are large enough to permit
self-calibration of the mass-observable relationships
(Holder et al., 2001; Levine et al., 2002; Weller et al.,
2002). The accuracy achievable with self-calibration
depends critically on the nature of cluster evolution,
because the self-calibration procedure requires evolution
of the relevant mass-observable relation to be expressed in
a parametric form. Constraints on the cosmological
parameters are considerably weaker if the actual evolution
does not follow the assumed parametric form. However,
cross-calibration of mass-observable evolution through
intensive supplementary observations of a small subset of
the large survey restores much of the potential inherent in
self-calibration (Majumdar and Mohr, 2003).

Including information on cluster bias inherent in a large
cluster survey further tightens the constraints on dark
energy. Because the tendency of clusters to cluster with
one another depends in a simple way on the cosmological
model (Sec. III.B.5), folding this information into the
self-calibration procedure improves the accuracy with which
cosmological parameters can be measured (Majumdar and Mohr,
2004). Figure 10 shows how the estimated constraints on Ω_M
and w tighten when information about cluster bias is added.
It assumes that the universe is flat (Ω_Λ = 1 − Ω_M) and
considers three different planned cluster surveys: two
large Sunyaev-Zeldovich surveys (SPT and Planck) and a
large X-ray survey (DUET), each of which will find 20,000
to 30,000 clusters to z ~ 1 (see Majumdar and Mohr, 2004,
for details). In the most optimistic cases, the parameters
Ω_M, Ω_Λ, and w will be measured with ~5% accuracy.

A large survey also minimizes the sample variance that
arises from cluster bias (Evrard et al., 2002; Hu and
Kravtsov, 2003). Because clusters tend to be clustered, the
variance in the number of clusters within small sample
volumes is larger than the Gaussian expectation, adding
systematic uncertainty to the measured mass function. This
effect is generally not large for current cluster surveys
but should be taken into account if one is designing a
cluster survey for making high-precision cosmological
measurements.
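
A rough sketch of why clustering inflates the count variance, assuming the commonly used decomposition into a shot-noise term plus a clustering term proportional to the square of the bias. The formula, the function name, and the example values are assumptions for illustration; this section does not spell out the expression used in the cited papers.

```python
def count_variance(n_expected, bias, sigma_volume):
    """Variance of cluster counts in a survey volume.

    n_expected   : mean number of clusters expected in the volume
    bias         : assumed mean linear bias of the cluster sample
    sigma_volume : assumed rms matter-density fluctuation on the
                   scale of the survey volume
    """
    shot_noise = n_expected
    clustering = (bias * n_expected * sigma_volume) ** 2
    return shot_noise + clustering


# Hypothetical example: 100 expected clusters, bias 3, 5% density fluctuation.
variance = count_variance(100.0, 3.0, 0.05)
excess_over_shot_noise = variance / 100.0
```

In this toy case the clustering term dominates the shot noise, and it shrinks relative to the mean as the survey volume grows and sigma_volume falls.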

In summary, observations of cluster evolution already
constrain the density of gravitating matter to be
Ω_M ≈ 0.3 ± 0.1, meaning that Ω_Λ ≈ 0.7 ± 0.1 if the
universe is flat. Using this value of Ω_M to break the
Ω_M-σ_8 degeneracy leads to σ_8 ≈ 0.7-1.0, depending on the
mass-temperature calibration. The major source of
uncertainty in all these cosmological parameters comes not
from the statistics of the survey but rather from
uncertainties in the normalization and rate of evolution in
the mass-observable relations. In order to better
understand these relations and how they evolve, we need to
know how galaxy formation affects the evolution of the
stuff we can observe: the baryons in clusters. That is
where we turn our attention next.
FIG. 10 Expected constraints on cosmological parameters
from self-calibrated surveys. The SPT and Planck surveys
will find 20,000 to 30,000 clusters through the S-Z effect.
DUET is a proposed X-ray survey designed to find ~20,000
clusters. Dotted contours show the expected constraints on
w and Ω_M from self-calibration if redshift evolution of
the cluster observables behaves exactly according to the
standard scaling relations, allowing the redshift
dependences of those scaling relations to be fixed.
Long-dashed contours show how the constraints loosen when
redshift evolution is determined as part of the
self-calibration. Dot-dashed contours show how the
constraints begin to tighten when information about cluster
bias is included in the calibration. Solid contours show
the best-case scenario in which the self-calibration
includes both information about cluster bias and
supplementary follow-up calibrations of a small subset of
the survey. (Figure from Majumdar and Mohr, 2004.)



IV. EVOLUTION OF THE BARYONIC COMPONENT 

Those who are cosmologists at heart are interested in how
galaxy formation affects the intracluster medium primarily
because they would like to know how to measure cluster
masses more accurately. Those who are astronomers at heart
are interested in the intracluster medium as well, but for
them its main attraction is that the hot gas contains
valuable information about the physical processes that
govern galaxy formation. Clarifying the connections between
galaxy formation and the mass-observable relations is
therefore important to both of these lines of research.

One of the nagging mysteries in our current picture of the
universe is why so few of the universe's baryons have
turned into stars (Cole, 1991; White and Frenk, 1991; White
and Rees, 1978). Numerical simulations of cosmological
structure formation that include baryonic hydrodynamics and
the radiative cooling processes that lead to galaxy
formation predict that ≳ 20% of the baryons should have
condensed into galaxies, but ≲ 10% have been found in the
form of stars (e.g., Balogh et al., 2001b). Some form of
feedback, involving supernovae and perhaps outflows from
active galactic nuclei, seems to have stymied condensation,
but we are still largely ignorant about how this feedback
works.

Galactic winds like those observed from nearby starburst
galaxies, in which multiple clustered supernovae are
driving the powerful outflows, are likely to be important
in regulating early star formation, but observational
constraints on the mass and energy flux in such winds are
sketchy at best (Heckman, 2002; Martin, 1999), particularly
at high redshift (Adelberger et al., 2003; Pettini et al.,
2001, 2000). These galactic winds presumably had a dramatic
impact on the intergalactic medium and subsequent galaxy
formation, with effects that may have persisted until the
present day (e.g., Benson and Madau, 2003; Oh and Benson,
2003). Likewise, quasars and other forms of activity driven
by black-hole growth in the nuclei of young galaxies may
also have produced powerful outflows with lasting
consequences for the intergalactic gas, but the energy
input from these objects is still highly uncertain (Inoue
and Sasaki, 2001; Nath and Roychowdhury, 2002; Scannapieco
and Oh, 2004; Voit, 1994, 1996).

Unfortunately, the low-redshift intergalactic medium, where
most of the universe's baryons are thought to reside, is
notoriously hard to observe. Because the majority of this
gaseous matter remains undetected, it is sometimes referred
to as the "missing baryons" (e.g., Cen and Ostriker, 1999).
A handful of quasars are bright enough beacons for probing
the missing baryons via absorption-line studies with the
ultraviolet spectrographs on the Hubble Space Telescope
(e.g., Penton et al.; Shull et al., 1996), and that number
will increase if the Cosmic Origins Spectrograph is
installed on Hubble. However, the inferences drawn from
such studies depend critically on the uncertain
heavy-element abundance and ionization state of these
intergalactic clouds (e.g., Shull et al., 2003; Tripp et
al., 2000).

Clusters of galaxies are still the only places in the 
universe where we have anything approaching a com- 
plete accounting of intergalactic baryons, their thermal 
state, and their heavy-element enrichment. Thus, ob- 
servations of the intracluster medium can provide unique 
insights into the cooling and feedback processes that
govern galaxy formation. In order to interpret the
signatures of galaxy formation in the intracluster medium
we need to understand how the thermodynamic properties of
today's clusters are linked to the physics of the
intergalactic baryons at z ≳ 2, the epoch of galaxy
formation.

This section of the review discusses the current under- 
standing of the interactions between galaxy formation 
and the intracluster medium, focusing in particular on 
how those interactions affect the mass-observable rela- 
tions so crucial to cosmology. It begins by outlining the 
properties that clusters would have if radiative cooling 
of the universe's baryons and subsequent galaxy forma- 
tion were suppressed. Because these properties do not 
agree with observations, radiative cooling and galaxy for- 
mation must somehow have altered the structure of the 
intracluster medium, with important consequences for 
the mass-observable relations. The middle of this sec- 
tion summarizes some of the recent progress that has 
been made in understanding the role of galaxy formation 
and its impact on the observable properties of clusters. 
It then concludes with a brief discussion of the existing 
constraints on baryon condensation in clusters. 



A. Structure Formation and Gravitational Heating 

People who study clusters of galaxies are sometimes 
asked how the X-ray emitting gas gets so hot. The an- 
swer to that question is simple. If radiative cooling is 
negligible, then gravitationally driven processes will heat 
diffuse gas to the virial temperature of the potential well 
that confines it. A tougher question would be to ask why 
the intracluster medium has the density that it does. In 
order to answer that question, one needs to know what 
produces the entropy of the X-ray emitting gas. Without 
galaxy formation in the picture, shocks driven by hierar- 
chical structure formation are the only source of entropy 
for the intracluster medium, and this mode of entropy 
production leads to clusters whose density and tempera- 
ture structures are nearly self-similar. 



1. Intracluster Entropy 

Entropy is of fundamental importance for two reasons: 
it determines the structure of the intracluster medium 
and it records the thermodynamic history of the clus- 
ter's gas. Entropy determines structure because high- 
entropy gas floats and low-entropy gas sinks. A cluster's 
intergalactic gas therefore convects until its isentropic 
surfaces coincide with the equipotential surfaces of the 
dark-matter potential. Thus, the entropy distribution of 
a cluster's gas and the shape of the dark-matter poten- 
tial well in which that gas sits completely determine the 
large-scale X-ray properties of a relaxed cluster of
galaxies. The gas density profile ρ_g(r) and temperature
profile T(r) of the intracluster medium in this state of
convective and hydrostatic equilibrium are just
manifestations of its entropy distribution.

This review adopts the approach of other work in this field
and defines "entropy" to be

    K \equiv \frac{k_B T}{\mu m_p \rho_g^{2/3}}    (63)

The quantity K is the constant of proportionality in the
equation of state P = K ρ_g^(5/3) for an adiabatic
monatomic gas, and is directly related to the standard
thermodynamic entropy per particle, s = k_B ln K^(3/2) +
s_0, where s_0 is a constant that depends only on
fundamental constants and the mixture of particle masses.
Another quantity frequently called "entropy" in the cluster
literature is S = k_B T n_e^(-2/3). In order to avoid
confusion with the classical definition of entropy, we will
call this quantity K_e. For the typical elemental
abundances in the intracluster medium, one can convert
between these definitions using the relation

    K_e = 960\,{\rm keV\,cm^2} \left[ \frac{K}{10^{34}\,{\rm erg\,cm^2\,g^{-5/3}}} \right]    (64)

A cluster achieves convective equilibrium when dK/dr ≥ 0
everywhere, and the entropy distribution that determines
the gas configuration in this state can be expressed as
K(M_g), where the inverse relation M_g(K) is the mass of
gas with entropy < K.
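
As a cross-check on the conversion between the two entropy definitions, the sketch below evaluates the factor linking K and K_e from the relation ρ_g = μ_e m_p n_e. The values μ = 0.6 and μ_e = 1.14 and the function name are assumptions chosen for illustration; the text above states only that typical intracluster abundances were used.

```python
M_P = 1.67262e-24          # proton mass [g]
ERG_PER_KEV = 1.602177e-9  # erg per keV


def k_to_ke(K_cgs, mu=0.6, mu_e=1.14):
    """Convert K [erg cm^2 g^(-5/3)] to K_e = k_B T n_e^(-2/3) [keV cm^2].

    Uses k_B T = K * mu * m_p * rho^(2/3) and n_e = rho / (mu_e * m_p),
    so that K_e = K * mu * mu_e^(2/3) * m_p^(5/3).
    """
    factor = mu * mu_e ** (2.0 / 3.0) * M_P ** (5.0 / 3.0)
    return K_cgs * factor / ERG_PER_KEV


ke_for_reference_K = k_to_ke(1.0e34)  # compare with the 960 keV cm^2 above
```

With these assumed molecular weights the result lands within about one percent of the quoted 960 keV cm^2, and it shifts by a few percent for other plausible choices of μ and μ_e.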

Comparisons between the entropy distributions of clusters
that differ in mass can be simplified by casting those
distributions into dimensionless form (e.g., Voit et al.,
2002). Combining the mean density of dark matter within the
scale radius r_200, the global baryon fraction
f_b = Ω_b/Ω_M, and the characteristic halo temperature
T_200 gives the characteristic entropy scale
For f_b = 0.022 h^-2 / Ω_M, this entropy scale reduces to
Writing the entropy scale in this way makes explicit the
fact that the observed temperature of a cluster is not
necessarily a reliable guide to the characteristic entropy
K_200 of its halo. If the intracluster medium of a real
cluster is either hotter or cooler than T_200, then one
must apply the correction factor T_200/T_lum when computing
the cluster's value of K_200.
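
A numerical sketch of the characteristic entropy scale, assuming it takes the form K_200 = k_B T_200 / [μ m_p (200 f_b ρ_cr)^(2/3)], which is how this example interprets the combination of quantities named above. The mean molecular weight μ = 0.6, the critical-density constant, the default Ω_M = 0.3, and the function name are assumptions; the conversion to keV cm^2 uses the factor of 960 from equation (64).

```python
M_P = 1.67262e-24          # proton mass [g]
ERG_PER_KEV = 1.602177e-9  # erg per keV
RHO_CRIT_H2 = 1.8785e-29   # present critical density divided by h^2 [g cm^-3]


def k200_kev_cm2(kT200_keV, omega_m=0.3, hubble_ratio=1.0, mu=0.6):
    """Characteristic entropy scale in keV cm^2.

    hubble_ratio is H(z)/H_0. The Hubble parameter h cancels because
    f_b = 0.022 h^-2 / omega_m while the critical density scales as h^2.
    """
    fb_rho_cr = (0.022 / omega_m) * RHO_CRIT_H2 * hubble_ratio ** 2
    K_cgs = kT200_keV * ERG_PER_KEV / (mu * M_P * (200.0 * fb_rho_cr) ** (2.0 / 3.0))
    return 960.0 * K_cgs / 1.0e34


k200_at_1_keV = k200_kev_cm2(1.0)
```

Under these assumptions a 1 keV halo at z = 0 has an entropy scale of a few hundred keV cm^2, and the scale drops as [H(z)/H_0]^(-4/3) at fixed temperature. For a real cluster the input should be T_200 rather than T_lum, as the text cautions.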



2. Entropy Generation by Smooth Accretion 

One way to approach the problem of gravitationally driven
entropy generation is through spherically symmetric models
of smooth accretion, in which gas passes through an
accretion shock as it enters the cluster (e.g., Knight and
Ponman, 1997; Tozzi and Norman, 2001; Voit et al., 2003).
If the incoming gas is cold, then the accretion shock is
the sole source of intracluster entropy. If instead the
incoming gas has been heated before passing through the
accretion shock, then the Mach number of the shock is
smaller and the intracluster entropy level reflects both
the amount of preheating and the production of entropy at
the accretion shock.

Let us first consider the case of cold accreting gas, in
which the pressure and entropy of the incoming gas are
negligible. Suppose that mass accretes in a series of
concentric shells, each with baryon fraction f_b, that
initially comove with the Hubble flow as in the spherical
collapse model of Sec. III.B.1. In this simple model, a
shell that initially encloses total mass M reaches zero
velocity at the turnaround radius r_ta and falls back
through an accretion shock at a radius in the neighborhood
of the virial radius r_ta/2.

Because the cold accreting gas is effectively pressureless,
the equations that determine the postshock entropy
where Ṁ_g = f_b Ṁ is the gas accretion rate, ρ_1 is the
preshock gas density, T_2 and ρ_2 are postshock quantities,
and the accretion radius has been set to r_ac = r_ta/2.
Equations (69) and (70) are restatements of the jump
conditions for strong shocks, assuming that the postshock
velocity is negligible in the cluster rest frame (e.g.,
Cavaliere et al., 1997; Landau and Lifshitz, 1959), and
equation is exact only for cosmologies with Ω_Λ = 0.
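
As a sketch, the relations described in this passage can be written as follows, assuming a strong shock in a monatomic gas, negligible postshock velocity in the cluster frame, and free fall from rest at r_ta to r_ac = r_ta/2. The ordering of the relations is an assumption, so no equation numbers are attached.

```latex
\begin{align*}
\dot{M}_g &= 4\pi r_{\rm ac}^2 \, \rho_1 v_{\rm ac}
  && \text{(mass flux through the shock)} \\
v_{\rm ac}^2 &= \frac{2GM}{r_{\rm ta}} = \frac{GM}{r_{\rm ac}}
  && \text{(infall from rest at } r_{\rm ta}\text{)} \\
k_B T_2 &= \tfrac{1}{3} \, \mu m_p v_{\rm ac}^2
  && \text{(strong-shock temperature jump)} \\
\rho_2 &= 4 \rho_1
  && \text{(strong-shock density jump)}
\end{align*}
% Combining these with K = k_B T / (\mu m_p \rho^{2/3}) gives
\begin{equation*}
K_2 = \frac{v_{\rm ac}^2}{3 \, (4\rho_1)^{2/3}}
    = \frac{1}{3} \left( \frac{\pi G^2}{f_b} \right)^{2/3}
      \left[ \frac{d \ln M}{d \ln t} \right]^{-2/3} (M t)^{2/3} .
\end{equation*}
```

The infall-velocity relation neglects any contribution from a cosmological constant to the potential, which is why a relation of this kind can be exact only when Ω_Λ = 0.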

The postshock entropy produced by smooth accretion of cold
gas at time t is therefore

Because the entropy generated at the shock front increases
monotonically with time, such an idealized cluster never
convects but rather accretes shells of baryons as if they
were onion skins. The resulting entropy distribution in
dimensionless form is
where η = M_g(t) / f_b M_200(t_0) is effectively a radial
coordinate corresponding to the amount of gas accreted by
time t divided by the amount accreted by the present time
t_0. Given these assumptions, the entropy profile arising
from smooth accretion of cold gas depends entirely on the
mass accretion history M(t), and the profiles of objects
with similar accretion histories should be self-similar
with respect to K_200.

This simple model yields entropy distributions whose
overall shape agrees with cluster observations (Voit et
al., 2003). The rate at which a cluster accretes matter
through hierarchical structure formation depends on the
growth function D(t) and the power-law slope
α_M = d ln σ^-1 / d ln M of the perturbation spectrum on
the mass scale of the cluster: M(t) ∝ [D(t)]^(1/α_M) (Lacey
and Cole, 1993; Voit and Donahue, 1998). Clusters ranging
in mass from 10^14 M_⊙ to 10^15 M_⊙ grow roughly as
M(t) ∝ t to M(t) ∝ t^2 in the concordance model (Tozzi and
Norman, 2001; Voit et al., 2003). Plugging these growth
rates into the smooth-accretion entropy relation leads to
entropy distributions between K ∝ M_g and K ∝ M_g^(4/3).
Throughout much of a cluster, the gas mass encompassed
within a given radius rises approximately linearly with
radius (Sec. II.B.1), meaning that the K(r) relation should
be slightly steeper than linear. Numerical models of smooth
accretion by Tozzi and Norman (2001) find K(r) ∝ r^1.1. The
entropy profiles observed outside the core regions of
clusters also obey K(r) ∝ r^1.1 (Pratt and Arnaud, 2002,
2003), but the extent to which this agreement is
coincidental is not clear.

If the accreting gas is not cold, then the intracluster
entropy profile produced by smooth accretion has an
isentropic core with an entropy level similar to the
preshock entropy (Balogh et al., 1999; Tozzi and Norman,
2001). A nonzero initial entropy level K_1 changes the
cold-accretion model outlined above by altering the jump
conditions represented by equations (69) and (70). When K_1
is no larger than the entropy generated at the accretion
shock, the entropy profile created by smooth accretion of
warm gas can be closely approximated by adding 0.84 K_1 to
the entropy profile K_sm(M_g) from the cold-accretion case
(Dos Santos and Doré, 2002; Voit et al., 2003). If K_1 is
large compared with K_sm, then the accretion shock is weak
or nonexistent, and accretion is nearly adiabatic, leading
to an isentropic entropy profile with the constant value
K_1.
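
As an illustrative sketch of cold smooth accretion with a power-law growth history M(t) ∝ t^a, the code below evaluates the logarithmic slope of K(M_g) and a dimensionless normalization. It assumes postshock entropy ∝ (M t)^(2/3) [d ln M / d ln t]^(-2/3), the virial relation k_B T_200 = G M_200 μ m_p / (2 r_200), and K_200 evaluated at the present epoch. The function names and the closed-form normalization are this example's own derivation under those assumptions, not expressions quoted from the text.

```python
def smooth_accretion_slope(a):
    """d ln K / d ln M_g for M(t) proportional to t**a."""
    return (2.0 / 3.0) * (1.0 + 1.0 / a)


def k_sm_over_k200(eta, a, h0_t0=1.0):
    """Dimensionless entropy at gas-mass coordinate eta = (t / t_0)**a.

    h0_t0 is the product H_0 * t_0 (assumed to be about 1).
    """
    norm = (2.0 / 3.0) * (15.0 * h0_t0 / (2.0 * a)) ** (2.0 / 3.0)
    return norm * eta ** smooth_accretion_slope(a)


slope_linear_growth = smooth_accretion_slope(1.0)     # M proportional to t
slope_quadratic_growth = smooth_accretion_slope(2.0)  # M proportional to t^2
outer_value = k_sm_over_k200(1.0, 1.5)                # M proportional to t^(3/2)
```

The two limiting growth rates bracket slopes between 1 and 4/3, and an intermediate history with a = 3/2 gives a slope of 10/9, close to the r^1.1 behavior discussed above when gas mass rises roughly linearly with radius.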
3. Entropy Generation by Hierarchical Merging

In real clusters the accreting gas is lumpy, not smooth,
which transforms the nature of entropy generation. Incoming
gas associated with accreting sublumps of matter enters the
cluster with a wide range of densities. There is no
well-defined accretion shock but rather a complex network
of shocks as different lumps of infalling gas mix with the
intracluster medium of the main halo. Numerical simulations
of this process beginning with cosmological initial
conditions produce clusters that have nearly self-similar
entropy structure (e.g., Navarro et al., 1995), as expected
from the scaling properties of hierarchical structure
formation (Kaiser, 1986).

FIG. 11 Dimensionless entropy K/K_200 as a function of
scale radius r/r_200 for 30 clusters simulated without
radiative cooling or feedback. Black squares show the
median profile, and the dashed line illustrates the
power-law relation K/K_200 = 1.26 (r/r_200)^1.1. Most of
the entropy profiles shown lie close to this relation in
the radial range 0.1 < r/r_200 ≲ 1.0. At smaller radii, the
entropy profiles generally flatten, and their dispersion
increases. This flattening is likely to be a real effect,
as it sets in well outside the shaded box showing the
gravitational softening length of the simulation.

Figure 11 shows entropy profiles of 30 clusters generated
with a numerical simulation of a ΛCDM cosmology including
hydrodynamics but not radiative cooling (Kay, 2004; Voit et
al., 2004). The masses of these clusters span more than a
factor of 10, but when their entropy profiles are divided
by the appropriate value of K_200 they lie nearly on top of
one another, at least outside the approximate core radius
0.1 r_200. This result is not unique to the simulation
method. All codes with sufficiently high resolution find
that nonradiative clusters have approximately self-similar
entropy structure, and consequently self-similar density
and temperature structure (Frenk et al., 1999; Voit et al.,
2004).

The self-similarity of the entropy profile in nonradiative
clusters is a very useful point of comparison for sleuthing
the effects of galaxy formation. Deviations from this
baseline profile are likely to be due to a combination of
radiative cooling and the feedback processes that ensue.
Voit et al. (2004) find that a good representation of the
baseline entropy profile produced outside the cores of
clusters by hierarchical structure formation is given by
the power law K_PL = (1.35 ± 0.2) K_200 (r/r_200)^1.1.
Specifying the baseline entropy profile within the cluster
core (< 0.1 r_200) is more difficult both because there is
more dispersion in that region among simulated clusters and
because the results there depend somewhat on the
hydrodynamical method used in the simulations.
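
A short sketch evaluating the baseline power law quoted above, K_PL/K_200 = 1.35 (r/r_200)^1.1, outside the core. The function name and the default arguments are illustrative choices; only the normalization and slope come from the text.

```python
def k_baseline_power_law(r_over_r200, norm=1.35, slope=1.1):
    """Baseline K/K_200 for nonradiative clusters, valid outside ~0.1 r_200."""
    return norm * r_over_r200 ** slope


k_at_r200 = k_baseline_power_law(1.0)
k_at_core_edge = k_baseline_power_law(0.1)
# The quoted +/- 0.2 uncertainty on the normalization brackets the profile.
k_at_core_edge_range = (
    k_baseline_power_law(0.1, norm=1.15),
    k_baseline_power_law(0.1, norm=1.55),
)
```

Dividing an observed K(r)/K_200 profile by this baseline gives a simple measure of how far a cluster departs from the gravitational-heating expectation at each radius.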

Another notable aspect of self-similarity in nonradiative
clusters is that the gas density profile and the
dark-matter density profile outside 0.1 r_200 have
virtually identical shapes (Frenk et al., 1999; Navarro et
al.). This feature leads to another useful approximation to
the entropy profiles of nonradiative clusters (Bryan,
2000). One can specify the gas density profile by assuming
that it obeys an NFW density profile (Sec. III.B.2) with
the same concentration as the dark matter and a total
baryon mass f_b M_200 within r_200 and then compute the
temperature profile that would keep the gas in hydrostatic
equilibrium. The temperature and density profiles in this
kind of model approximately obey the polytropic relation
T(r) ∝ [ρ_g(r)]^(γ_eff − 1), with γ_eff ≈ 1.1-1.2 (Komatsu
and Seljak, 2001; Voit et al., 2002). Combining them
produces an alternative baseline entropy profile that
depends on the concentration parameter c_200 of the
dark-matter halo and that this review will denote as
K_NFW(r).

Despite the complexity of the shock structure in
hierarchical accretion, the numerically simulated entropy
profiles are similar in shape to those created by smooth
accretion models (Borgani et al., 2002a, 2001). However,
these profiles have lower overall entropy levels than the
smooth-accretion profiles (Voit et al., 2003). Figure 12
demonstrates this point by comparing the two
approximations, K_PL and K_NFW, of simulated nonradiative
clusters with two entropy profiles drawn from smooth
accretion models, one from the numerical computations of
Tozzi and Norman (2001) and the other from equation (72)
assuming M ∝ t^(3/2) and H_0 t_0 = 1, which are reasonable
assumptions for ΛCDM models.

The likely reason for this discrepancy is that smooth
accretion maximizes entropy production because it minimizes
the mean mass-weighted density of accreting gas (Ponman et
al., 2003; Voit et al., 2003). Smoothing the accreting gas
does not change the accretion velocity but does reduce the
mean density of accreting gas lumps. Because postshock
entropy scales as v_ac^2 ρ_1^(-2/3), the mean entropy of
lumpy accreted gas is therefore less than in the
smooth-accretion case. This effect of smoothing might not
be entirely academic, because the observed entropy profiles
of low-temperature clusters show a similar entropy boost
relative to the baseline profile (Voit and Ponman, 2003).
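
A toy illustration of this argument, assuming postshock entropy proportional to v_ac^2 ρ_1^(-2/3) with the same accretion velocity for all gas. The two-component density distribution, the arbitrary units, and the function names are hypothetical.

```python
def mean_postshock_entropy(mass_fractions, densities, v_ac=1.0):
    """Mass-weighted mean of v_ac^2 * rho_1^(-2/3) over the accreting gas."""
    return sum(
        f * v_ac ** 2 * rho ** (-2.0 / 3.0)
        for f, rho in zip(mass_fractions, densities)
    )


def smoothed_density(mass_fractions, densities):
    """Density obtained by spreading the same mass over the same total volume."""
    total_volume = sum(f / rho for f, rho in zip(mass_fractions, densities))
    return sum(mass_fractions) / total_volume


# Hypothetical lumpy inflow: half the mass in lumps 8 times denser than the rest.
fractions = [0.5, 0.5]
densities = [8.0, 1.0]

lumpy = mean_postshock_entropy(fractions, densities)
smooth = mean_postshock_entropy([1.0], [smoothed_density(fractions, densities)])
```

Because ρ^(-2/3) is a concave function of specific volume, the mass-weighted mean over lumps can never exceed the value for fully smoothed gas, so `smooth` is larger than `lumpy` for any density distribution, not just this one.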



4. Observed Entropy Profiles 

Astronomers have known for more than a decade that the
structure of the intracluster medium in real clusters
cannot be self-similar because the luminosity-temperature
relation of clusters does not agree with self-similar
scaling (Edge and Stewart, 1991; Evrard and Henry, 1991;
Kaiser, 1991; Sec. III.C.3). Only within the last couple of
years has the nature of that deviation from self-similarity
become clear. High-quality cluster observations with the
XMM-Newton satellite are showing that intracluster entropy
profiles have the K(r) ∝ r^1.1 shape characteristic of
gravitational structure formation outside of the core, but
the overall normalization of these profiles scales as
T_lum^(2/3) instead of T_lum, as in the baseline profiles
(Pratt and Arnaud, 2003). Analyses of much larger cluster
samples observed with earlier X-ray telescopes have arrived
at the same conclusion. Instead of self-similarity with
K(r/r_200) ∝ T_lum, Ponman et al. (2003) find altered
similarity with K(r/r_200) ∝ T_lum^(2/3) at both the core
radius 0.1 r_200 and farther out in clusters, at the scale
radius r_500 ≈ 0.66 r_200. The question to be answered is
therefore how galaxy formation and feedback manage to
produce such a shift in the overall normalization of
cluster entropy profiles without substantially changing
their shape.

FIG. 12 Entropy profiles from smooth accretion and
hierarchical accretion. Smoothing of the gas accreting onto
a cluster boosts entropy production while maintaining the
characteristic K(r) ∝ r^1.1 entropy profile. The two lower
lines illustrate approximate entropy profiles produced by
hierarchical accretion, including the power-law expression
from Figure 11 and the K_NFW model described in the text,
with c_200 = 5. The two upper lines illustrate entropy
profiles resulting from smooth accretion models, including
a profile computed by Tozzi and Norman (2001) for a
10^15 h^-1 M_⊙ cluster with 300 keV cm^2 of preheating and
radiative cooling implemented (dotted line) and a profile
computed from equation (72) and preheating amounting to
0.1 K_200. The two smooth models run roughly parallel to
the hierarchical accretion models but their normalizations
are ~1.5 times higher.
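
A small sketch of what altered similarity implies for the dimensionless entropy. If K at fixed r/r_200 scales as T_lum^(2/3) while the baseline scale K_200 scales as T_lum, then K/K_200 varies as T_lum^(-1/3). The pivot temperature at which observed and baseline profiles are taken to coincide is a hypothetical choice for illustration, as is the function name.

```python
def entropy_excess(T_keV, T_pivot_keV=10.0):
    """K/K_200 relative to its value at the pivot temperature.

    Assumes K at fixed r/r_200 scales as T^(2/3) and K_200 scales as T.
    """
    return (T_keV / T_pivot_keV) ** (2.0 / 3.0 - 1.0)


excess_cool_cluster = entropy_excess(1.25)  # eight times cooler than the pivot
excess_at_pivot = entropy_excess(10.0)
```

A factor of eight in temperature therefore corresponds to only a factor of two in dimensionless entropy, which gives a sense of the size of the shift that galaxy formation and feedback must explain.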



B. Galaxy Formation and Feedback 

In the decade since astronomers became aware of sim- 
ilarity breaking in clusters there have been many numer- 
ical simulations devoted to understanding it. Our under- 
standing of this problem remains incomplete because in- 
cluding galaxy formation in cosmological models of clus- 
ter formation is a formidable computational challenge, 
requiring codes that simulate three-dimensional hydrody- 
namics spanning an enormous dynamical range in length 
scales and that track a large number of physical
processes. The volume required to model a cosmologically
significant sample of clusters is of order 10^26 cm in
linear scale, individual galaxies have sizes ~10^23 cm,
star-forming regions within those galaxies can be as small
as 10^17 cm, and the stars themselves are only ~10^11 cm in
size. Sophisticated hydrodynamical techniques are now
able to model the formation of the first stars from cosmo- 
logical initial conditions (Abel, Bryan, & Norman 2001), 
but are far from being able to track in detail the forma- 
tion of an entire galaxy's worth of stars, let alone all the 
feedback processes that can occur. 

For the time being, the difficulty of solving this prob- 
lem from first principles means that modelers have to 
be selective about the physical processes and conditions 
that merit modeling. Important clues to what the es- 
sential processes are can be gleaned from the observed 
characteristics of clusters. This part of the review sifts 
through some of those clues, showing that radiative cool- 
ing is likely to be the process that sets the entropy scale 
of similarity breaking but that radiative cooling cannot 
act alone. Otherwise, too much baryonic matter would 
condense into stars and cold gas clouds. 



1. Preheating 

Early approaches to the problem of similarity breaking
in clusters postulated that some sort of heating process
imposed a universal minimum entropy, an "entropy
floor," on the intergalactic gas before it collected into
clusters (Evrard and Henry, 1991; Kaiser, 1991). Imposing
a global entropy floor helps to bring the theoretical
$L_X$-$T_{\rm lum}$ relation into better agreement with observations
because this extra entropy makes the gas harder to compress
in cluster cores, where entropy is smallest, particularly
in the shallower potential wells of low-temperature
clusters. This resistance to compression breaks cluster
similarity by lowering the core density and therefore the
X-ray emissivity in low-T clusters more than in high-T
clusters, thereby steepening the $L_X$-$T_{\rm lum}$ relation.

According to this preheating picture, the core entropy
level and scaling relations of clusters should reflect the
global entropy floor produced at early times. Initial measurements
of entropy at the core radius $0.1\,r_{200}$ demonstrated
that low-temperature clusters had greater amounts of entropy
than expected from self-similarity and suggested
that the level of the entropy floor was $\sim 135\ {\rm keV\,cm^2}$
(Lloyd-Davies et al., 2000; Ponman et al., 1999). This
result matched well with numerical simulations of cluster
formation with preheating levels of 50-100 keV cm$^2$
that produced clusters with approximately the right
$L_X$-$T_{\rm lum}$ relation (Bialek et al., 2001).

However, simple preheating now appears to be
too crude an explanation for similarity breaking.
In the preheating picture, low-temperature clusters
should have large isentropic cores (Balogh et al., 1999;
Tozzi and Norman, 2001), but this prediction disagrees
with the observations showing that the shapes of cluster
entropy profiles do not depend significantly on temperature
(Sec. IV.A.4). In addition, the abundant evidence
for intergalactic gas at $\sim 10^4$ K from quasar absorption
line studies clearly shows that preheating cannot be
global at $z > 2$, and the preheating models themselves
do not explain why the level of the entropy floor should
be $\sim 135\ {\rm keV\,cm^2}$.



2. Radiative Cooling 

In contrast, the observed entropy scale of similarity
breaking emerges naturally from the process of radiative
cooling. Intergalactic gas both inside and outside of clusters
radiates thermal energy at a rate given by the cooling
function $\Lambda_c(T)$, described in more detail in Sec. II.B.
Cooling that radiates an energy $\Delta q$ per particle reduces
the entropy by $\Delta \ln K^{3/2} = \Delta q / k_B T$. Thus, the equation
expressing these radiative losses can be written

is the entropy level at which constant-density gas at temperature
$T$ radiates an energy equivalent to its thermal
energy in the time $t_0$. The latter formula reduces to

when pure bremsstrahlung cooling is assumed.

The fact that the entropy threshold below which gas
cools within the universe's lifetime is close to the entropy
floor inferred from clusters with $\lesssim 2$ keV temperatures
suggests that radiative cooling sets the entropy
scale for similarity breaking (Voit and Bryan, 2001).
Voit and Ponman (2003) further quantify this point. Figure 13
shows how entropy measurements at $0.1\,r_{200}$ in a
large sample of clusters (Ponman et al., 2003) compare
with the cooling threshold $K_c(T)$ for gas with heavy-element
abundances equal to 30% of their solar values
relative to hydrogen. Both the measured core entropies
and the entropy threshold for cooling scale as $T^{2/3}$, and
they are approximately equal, although the scatter in the
data is quite significant.

FIG. 13 Comparison between entropy measured at $0.1\,r_{200}$
and the cooling threshold in a large sample of clusters. Small
points show the entropy $K_{0.1}$ measured at $0.1\,r_{200}$ in a sample
of 64 clusters, and points with error bars show the mean
entropy measurement in temperature bins of eight clusters
each. The dotted line gives the mean entropy predicted by
simulations of clusters without radiative cooling or feedback.
The solid line shows the value of the cooling threshold $K_c(T)$
computed for heavy-element abundances 0.3 times their solar
values and $t_0 = 14$ Gyr. The dashed line shows the entropy
predicted at $0.1\,r_{200}$ by the simple analytical model of
Voit and Bryan (2001).
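
The order of magnitude of the cooling threshold can be checked numerically. The following is a minimal sketch, not the calculation used for the figure: it assumes pure bremsstrahlung cooling with the textbook free-free emissivity coefficient, an assumed Gaunt factor of 1.2, an assumed fully ionized hydrogen-helium composition, and entropy defined as $K = k_B T n_e^{-2/3}$. The threshold is taken to be the entropy at which gas at constant density radiates its thermal energy ($\frac{3}{2} k_B T$ per particle) in a time $t_0$. The function name and the adopted constants are illustrative choices; line cooling from heavy elements, which raises the threshold, is ignored.

```python
import math

KEV_IN_ERG = 1.602e-9        # 1 keV in erg
KEV_IN_K = 1.1605e7          # 1 keV expressed as a temperature in K
GYR_IN_S = 3.156e16          # 1 Gyr in seconds

# Assumed plasma properties (illustrative, not taken from the text):
FF_COEFF = 1.4e-27           # free-free emissivity coefficient, cgs units
GAUNT = 1.2                  # assumed frequency-averaged Gaunt factor
Z2_SUM = 1.33                # sum of n_i Z^2 over ions, per hydrogen nucleus
N_TOT_PER_H = 2.25           # electrons plus ions per hydrogen nucleus


def cooling_threshold_kev_cm2(kT_kev, t0_gyr=14.0):
    """Entropy K_c (keV cm^2) at which constant-density gas radiates its
    thermal energy, 3/2 kT per particle, in a time t0 (bremsstrahlung only)."""
    T_kelvin = kT_kev * KEV_IN_K
    # Cooling function defined so that emissivity = n_e * n_H * Lambda
    lam = FF_COEFF * GAUNT * Z2_SUM * math.sqrt(T_kelvin)
    kT_erg = kT_kev * KEV_IN_ERG
    t0 = t0_gyr * GYR_IN_S
    # Solve n_e * n_H * Lambda * t0 = (3/2) * n * kT for the electron density
    n_e = 1.5 * N_TOT_PER_H * kT_erg / (lam * t0)
    return kT_kev * n_e ** (-2.0 / 3.0)


if __name__ == "__main__":
    for kT in (0.5, 1.0, 2.0, 8.0):
        K_c = cooling_threshold_kev_cm2(kT)
        print(f"kT = {kT:4.1f} keV  ->  K_c ~ {K_c:6.1f} keV cm^2")
```

Under these assumptions the threshold at 1 keV comes out at several tens of keV cm$^2$, the same order as the entropy floor inferred from observations, and it scales as $T^{2/3}$ and $t_0^{2/3}$.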

Section IV.C.1 below shows that radiative cooling also
accounts well for the scaling relations of global X-ray
properties like $L_X$ and $T_{\rm lum}$ with mass. However, casting
equation (75) in dimensionless form illustrates why at
least some feedback must compensate for cooling:

The cooling threshold in low-temperature clusters at the
present time is ~20% of the characteristic entropy $K_{200}$
and greater than that if emission-line cooling from heavy
elements is included. At earlier times, the dimensionless
cooling threshold is even higher, meaning that a
large proportion of the baryons belonging to the progenitor
objects that ultimately assembled into present-day
clusters should have condensed into stars or cold
gas clouds. This is one of the manifestations of the
classic overcooling problem of hierarchical galaxy formation
(Blanchard et al., 1992; Cole, 1991; White and Rees, 1978).
Because the observed mass ratio of stars to hot
gas in clusters is only about 10% (Sec. IV.D), wholesale
baryon condensation doesn't seem to have happened.
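
The size of the dimensionless cooling threshold can be illustrated with a short calculation. This is a sketch under stated assumptions: the characteristic entropy is taken to be $K_{200} = k_B T_{200}\, \bar{n}_e^{-2/3}$, with $\bar{n}_e$ the mean electron density of gas at 200 times the critical density, and the Hubble constant, baryon fraction, and mean mass per electron are assumed round values rather than numbers from the text. The 80 keV cm$^2$ threshold in the example call is likewise a hypothetical input.

```python
import math

G_CGS = 6.674e-8             # gravitational constant, cm^3 g^-1 s^-2
M_PROTON = 1.6726e-24        # proton mass, g
KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec


def critical_density(hubble_km_s_mpc):
    """Critical density rho_cr = 3 H^2 / (8 pi G) in g cm^-3."""
    H = hubble_km_s_mpc / KM_PER_MPC          # convert to s^-1
    return 3.0 * H ** 2 / (8.0 * math.pi * G_CGS)


def k200_kev_cm2(kT200_kev, hubble_km_s_mpc=70.0, f_baryon=0.15, mu_e=1.15):
    """Characteristic entropy K_200 = kT_200 / n_e^(2/3), where n_e is the mean
    electron density of gas at 200 times the critical density.
    The default parameter values are assumptions made for illustration."""
    rho_cr = critical_density(hubble_km_s_mpc)
    n_e = 200.0 * f_baryon * rho_cr / (mu_e * M_PROTON)
    return kT200_kev * n_e ** (-2.0 / 3.0)


def dimensionless_threshold(K_c_kev_cm2, kT200_kev, **cosmology):
    """Ratio K_c / K_200 for a given cooling threshold."""
    return K_c_kev_cm2 / k200_kev_cm2(kT200_kev, **cosmology)


if __name__ == "__main__":
    # Hypothetical input: a threshold of 80 keV cm^2 in a 1 keV cluster
    print(f"K_200 ~ {k200_kev_cm2(1.0):.0f} keV cm^2")
    print(f"K_c / K_200 ~ {dimensionless_threshold(80.0, 1.0):.2f}")
```

Passing a larger Hubble parameter, as appropriate for an earlier epoch, lowers $K_{200}$ at fixed temperature, which is one way to explore how the ratio changes with time.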

Recognition of this overcooling problem led
Voit and Bryan (2001) to propose a way for radiative
cooling to determine the entropy scale of similarity
breaking without acting alone. The basic idea is that gas
with entropy less than $K_c(T)$ cannot persist indefinitely.
It must either cool and condense or be heated until its
entropy exceeds $K_c(T)$. At any given time, feedback is
triggered by condensing gas parcels with entropy less
than the cooling threshold and acts until those parcels
are eliminated by either cooling, heating, or some
combination of the two. Thus, the joint action of cooling
and feedback imprints an entropy scale corresponding
to the cooling threshold, regardless of how strong the
feedback is. This kind of effect has now been seen in a
number of numerical simulations that include cooling
and differing forms of feedback (Borgani et al., 2002a;
Borgani et al., 2003; Davé et al., 2002; Kay et al., 2003;
Valdarnini, 2003).

The fact that similarity breaking is not very sensitive
to the efficiency of feedback is good news for cosmologists
but bad news for astrophysicists. It offers hope that we
can understand the mass-observable relations of clusters
without solving all the messy astrophysical problems of
feedback. Yet, it also implies that the mass-observable relations
alone do not tell us much about the nature of that
feedback. Instead, we must look to the spatially resolved
entropy profiles of clusters (Kay, 2004; Voit and Ponman,
2003) and the ratio of condensed baryons to hot gas
in clusters (Balogh et al., 2001b; Borgani et al., 2002a;
Borgani et al., 2003; Kay et al., 2003).



3. Feedback from Supernovae 

Supernovae are the most obvious candidate for supplying
the feedback that suppresses condensation, but it is
not clear that supernova heating and the galactic winds
it drives can provide enough entropy to keep the fraction
of condensed baryons below about 10%. Heavy-element
abundances in clusters imply that the total amount
of supernova energy released during a cluster's history
amounts to $\sim 0.3$-1 keV per gas particle in the intracluster
medium (Finoguenov et al., 2001a; Pipino et al.,
2002). The amount of energy input needed to explain
the mass-observable relations while avoiding overcooling
is $\sim 1$ keV (Tornatore et al., 2003; Voit et al., 2002;
Wu et al., 2001). This is at the upper end of the range inferred
from heavy elements, so the transfer of supernova energy
to the intracluster medium would have to be highly efficient,
which seems unlikely (Kravtsov and Yepes, 2000). Supernova
energy would have to be converted almost
entirely to thermal energy with very little radiated away.

In order to avoid radiative losses, supernova heating
must raise the entropy of the gas it heats to at least
100 keV cm$^2$. An evenly distributed thermal energy input
of order 1 keV would therefore have to go into gas significantly
less dense than $10^{-3}\ {\rm cm^{-3}}$ to avoid such losses.
Gas near the centers of present-day clusters, not to mention
the galaxies where supernovae occur, is denser than
that, particularly at earlier times when most of the star
formation happened. Simulations that spread supernova
feedback evenly therefore produce too many condensed
baryons in clusters (Borgani et al., 2002a). Artificial algorithms
that target supernova feedback at gas parcels
that would otherwise cool are more successful at preventing
overcooling (Kay et al., 2003). However, efforts to
implement a more realistic version of targeted feedback
in the form of galactic winds are still not entirely successful
at preventing overcooling (Borgani et al., 2003).
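
The density limit quoted above follows directly from the definition of entropy. As a minimal sketch, assuming $K = k_B T n_e^{-2/3}$ with $k_B T$ in keV and $n_e$ in cm$^{-3}$ (the function name is illustrative):

```python
def density_at_entropy(kT_kev, K_kev_cm2):
    """Electron density (cm^-3) at which gas of temperature kT has entropy K,
    obtained by inverting K = kT * n_e^(-2/3)."""
    return (kT_kev / K_kev_cm2) ** 1.5


if __name__ == "__main__":
    # Gas heated by ~1 keV per particle must reach at least 100 keV cm^2,
    # so it has to be less dense than the value printed here.
    print(f"n_e < {density_at_entropy(1.0, 100.0):.1e} cm^-3")
```

Gas denser than this limit that receives the same 1 keV per particle ends up below 100 keV cm$^2$ and can radiate the added energy away.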

It remains to be seen whether supernova feedback alone
can account for the observed entropy profiles of clusters.
Voit et al. (2003) and Ponman et al. (2003) have
proposed that entropy input from galactic winds preceding
the accretion of gas onto clusters could lead to a form
of entropy amplification that would explain the observations.
If galactic winds are strong enough to significantly
smooth out the lumpiness of the local intergalactic gas,
then the mode of accretion of this gas onto clusters will be
closer to smooth accretion than to hierarchical accretion,
thereby boosting the entropy generated through accretion
shocks without changing the profile's characteristic
shape. This effect is a plausible explanation for the altered
similarity of the observed entropy profiles, but it
has not yet been thoroughly tested in simulations. Intriguing
results by Kay (2004) show that an extremely
targeted feedback model, in which supernovae heat the
local gas to 1000 keV cm$^2$, successfully reproduces both
the normalization and shape of the observed entropy profiles.



4. Feedback from Active Galactic Nuclei

If supernovae cannot prevent overcooling, then perhaps
supermassive black holes in the nuclei of galaxies are
what stop it (Cavaliere et al., 2002; Valageas and Silk,
1999; Wu et al., 2001). The omnipresence of supermassive
black holes at the centers of galaxies
(Magorrian et al., 1998) and the excellent correlation
of their masses with the bulge and halo properties
of the host galaxy (Ferrarese and Merritt, 2000;
Gebhardt et al., 2000) strongly suggest that the growth
of black holes in the nuclei of galaxies goes hand-in-hand
with galaxy formation. Furthermore, the centers of many
clusters with low-entropy gas whose cooling time is less
than the age of the universe also contain active galactic
nuclei that are ejecting streams of relativistic plasma into
the intracluster medium (Burns, 1990). It is therefore
plausible that supermassive black holes at the centers of
clusters provide feedback that suppresses further cooling
whenever condensing intracluster gas accretes onto the
central black hole.

Such a feedback loop is attractive and consistent with
the circumstantial evidence, but the precise mechanism
of heating remains unclear. The bubbles of relativistic
plasma being inflated by the active galactic nuclei
in clusters appear not to be expanding fast enough to
shock heat the intracluster medium because the rims
of the bubbles are no hotter than their surroundings
(Fabian et al., 2000; McNamara et al., 2000). Also, if active
galactic nuclei simply injected heat energy into the
center of a cluster, then one would expect to see a flat
or reversed entropy gradient in clusters with strong nuclear
activity, indicating that convection is carrying heat
outward. Instead, the entropy in these cluster
cores increases monotonically outward (David et al.,
2001; Horner et al., 2004). One possibility is that heating
is episodic (Kaiser and Binney, 2003) and that we have
not yet found a cluster in the midst of an intense heating
episode. Another is that heating is somehow spread
evenly throughout the cluster core in a way that maintains
the entropy gradient (Brüggen and Kaiser, 2002;
Ruszkowski and Begelman, 2002). Yet another possibility
is that bursts of relativistic plasma drive sound waves
into the intracluster medium that eventually dissipate
into heat (Fabian et al., 2003).

Unfortunately, none of these heating mechanisms
has yet been tested in the context of cosmological structure
formation, so we do not know their overall impact on
either baryon condensation or the global entropy profiles
of clusters. Also, many aspects of the relationship between
cosmology and nuclear activity in galaxies remain
highly uncertain. A major role for quasar feedback is
plausible. However, the connection between the growth
of central black holes in galaxies and galaxy formation itself
is not well understood, and the efficiency with which
black holes convert accretion energy into outflows is unknown.



5. Transport Processes

Heat transport processes like thermal conduction and
turbulent mixing may also mitigate radiative cooling because
gas that condenses sets up a temperature gradient
along which heat energy can flow. In gas without
magnetic fields, electrons conduct heat along temperature
gradients giving a heat flux $\kappa_s \nabla T$, with
$\kappa_s \approx 6 \times 10^{-7}\, T^{5/2}\ {\rm erg\,cm^{-1}\,s^{-1}\,K^{-7/2}}$ (Spitzer, 1962), the
so-called Spitzer rate, valid when the scale length of the
temperature gradient is longer than the electron mean
free path. Clusters with central cooling times less than
$H_0^{-1}$ indeed tend to have positive temperature gradients
within the central $\sim 100$ kpc, raising the possibility
that heat conduction at least partially balances radiative
losses. Many models for conduction in cluster cores
[...] conduction does not satisfactorily balance radiative cooling.
Temperature-gradient observations are inconsistent
with steady-state balance between cooling and conduction
in a number of cluster cores (Horner et al.,
2004; Voigt and Fabian, 2004). However, mixing of
hot gas with cooler gas facilitated by intracluster
turbulence (Kim and Narayan, 2003) or AGN activity
(Brüggen and Kaiser, 2002) could enhance the effectiveness
of heat conduction.

It is possible that cooling, conduction, feedback, and
perhaps mixing as well are all needed for a complete solution
that explains the observed core temperature gradients
without overcooling. Conduction that balances
cooling in a steady state has often been dismissed on
the grounds that it is not stable enough to preserve
the observed temperature and density gradients for periods
of order $> 1$ Gyr (Cowie and Binney, 1977; Fabian,
1994). Because of conduction's extreme sensitivity to
temperature, it is difficult for radiative cooling and conduction
to achieve precise thermal balance with a globally
stable temperature gradient (Bregman and David,
1988; Soker, 2003). On the other hand, conduction
would have to be suppressed by at least two orders
of magnitude for radiative cooling to produce the observed
gradients (Binney and Cowie, 1981; Fabian et al.,
1981). Recent theoretical analyses of conduction have
concluded that this level of suppression is unrealistically
high (Malyshkin, 2001; Malyshkin and Kulsrud,
2001; Narayan and Medvedev, 2001). Combining cooling,
conduction, and feedback offers a way out of this
dilemma. Hybrid models in which conduction compensates
for cooling in the outer parts of the core while
feedback from an active galactic nucleus compensates for
it in the inner parts have had some success in reproducing
the observations (Brighenti and Mathews, 2003;
Ruszkowski and Begelman, 2002).



C. Galaxy Formation and Cluster Observables

Earlier we saw that the X-ray properties of the
self-similar clusters produced by purely gravitational
structure formation do not agree with observations
(Sec. IV.A). Observed clusters of a given mass appear to
be hotter than their theoretical counterparts and also less
luminous, especially at the cool end of the cluster temperature
range. Such disagreements have been worrisome to
cosmologists who would like to understand what governs
the cluster observables used to measure mass, but these
problems are on their way to being solved. Both analytical
work and hydrodynamical simulations performed
during the last several years are showing that the observed
$L_X$-$T_{\rm lum}$, $M_{200}$-$T_{\rm lum}$, and $L_X$-$M_{200}$ relations are
natural outcomes of galaxy formation. Significant uncertainties
remain, but the theoretical foundation for the
mass-observable relations essential for probing cosmology
with clusters is growing firmer.

FIG. 14 Luminosity-temperature relation. Points show cluster data from Arnaud and Evrard (1999) (solid triangles), who
avoided clusters with cool cores, cluster data from Markevitch (1998) (open squares) with cool cores excised, two sets of group
data from Helsdon and Ponman (2000) (crosses) and Osmond and Ponman (2004) (solid octagons) that were not corrected for
cool cores, and simulated clusters from Borgani et al. (2003) (small points). These simulations implement radiative cooling and
supernova feedback in the form of galactic winds. Lines show modified-entropy models from Voit et al. (2003) with entropy
truncated at the cooling threshold. There is a slight dependence on $\sigma_8$ in these models because higher values of $\sigma_8$ lead
to dark-matter halos with more concentrated cores. Both the analytical and numerical models agree well with the data at
$k_B T_{\rm lum} \gtrsim 2$ keV. Agreement is less good at lower temperatures, but the reasons for the disagreements are unclear. More
feedback may be needed in the numerical models to suppress the luminosities, and the large scatter in the observations at
$< 1$ keV may reflect a wide range in the effectiveness of feedback.



1. Role of Cooling 

Radiative cooling turns out to be the most important process
to include. While it might seem paradoxical, allowing
the intracluster medium to radiate thermal energy actually
causes its luminosity-weighted temperature to rise.
The reason for this behavior is that cooling selectively
removes low-entropy gas from the intracluster medium,
raising the mean entropy of what remains (Bryan, 2000;
Knight and Ponman, 1997; Pearce et al., 2000). In non-radiative
cluster simulations, the entropy of gas in the
vicinity of the cluster core is below the cooling threshold
$K_c$. This aspect of non-radiative models is unphysical,
because gas with entropy less than $K_c$ would radiate an
amount of energy greater than its total thermal energy
content over the course of the simulations. When cooling
is allowed to occur, this low-entropy core gas condenses
out of the intracluster medium and is replaced by higher
entropy core gas having a higher temperature, a lower
density, and therefore a lower luminosity.



A simple analytical model illustrates the effect of
the cooling threshold on the $L_X$-$T_{\rm lum}$ and $M_{200}$-$T_{\rm lum}$
relations (Voit and Bryan, 2001; Voit et al., 2002;
Wu and Xue, 2002). The model assumes that the intracluster
entropy distribution in the absence of galaxy
formation would be the $K_{\rm NFW}(M_g)$ distribution derived
from the density profile of the dark matter. Because condensation
and feedback both act to eliminate gas below
the cooling threshold, the model simply truncates the
entropy distribution at $K_c(T_{200})$ and discards all the gas
with lower entropy. One can interpret this gas removal
either as condensation or as extreme feedback that heats
the sub-threshold gas to a much higher entropy level.
This cooling and feedback need not occur at the center of
the cluster. In a hierarchical cosmology, much of the low-entropy
gas cools, condenses into galaxies, and produces
feedback long before the cluster is finally assembled.
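
The truncation step of this model is simple enough to sketch in a few lines. The example below is only a toy illustration: the gas is represented as a hypothetical list of (mass, entropy) parcels with made-up values, and the subsequent step of recomputing hydrostatic equilibrium in the dark-matter potential is not attempted.

```python
def truncate_entropy_distribution(parcels, K_c):
    """Discard all gas parcels with entropy below the threshold K_c.

    parcels: list of (gas_mass, entropy) pairs in arbitrary consistent units.
    Returns the surviving parcels and the fraction of gas mass removed."""
    total_mass = sum(mass for mass, _ in parcels)
    kept = [(mass, K) for mass, K in parcels if K >= K_c]
    removed_mass = total_mass - sum(mass for mass, _ in kept)
    return kept, removed_mass / total_mass


def mass_weighted_mean_entropy(parcels):
    """Mass-weighted mean entropy of a list of (gas_mass, entropy) parcels."""
    total_mass = sum(mass for mass, _ in parcels)
    return sum(mass * K for mass, K in parcels) / total_mass


if __name__ == "__main__":
    # Hypothetical equal-mass parcels with entropies in keV cm^2
    gas = [(1.0, 20.0), (1.0, 60.0), (1.0, 150.0), (1.0, 400.0)]
    kept, removed_fraction = truncate_entropy_distribution(gas, K_c=100.0)
    print("removed mass fraction:", removed_fraction)
    print("mean entropy before:", mass_weighted_mean_entropy(gas))
    print("mean entropy after: ", mass_weighted_mean_entropy(kept))
```

Whether the removed gas is regarded as condensed or as strongly heated makes no difference to this step, which is why the model is insensitive to the strength of feedback.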

Computing the hydrostatic configuration of the modified
entropy distribution in the original dark-matter potential
gives $L_X$ and $T_{\rm lum}$ as a function of the mass $M_{200}$
and concentration $c_{200}$ of the dark-matter halo. Figures
14 and 15 show that the resulting $L_X$-$T_{\rm lum}$ and
$M_{200}$-$T_{\rm lum}$ relations generally agree well with observations
but may slightly overpredict $L_X$ for objects cooler
than $\sim 2$ keV and do not account for the large scatter
at low temperatures. There are no free parameters in
this model, other than the cosmological parameters, because
the $M_{200}$-$c_{200}$ relation and the age of the universe
used to compute $K_c$ depend only on cosmology, and the
heavy-element abundance used to compute the cooling
threshold is taken from observations.

FIG. 15 Mass-temperature relation. Large points show cluster data from Finoguenov et al. (2001b) (solid triangles) and
Nevalainen et al. (2000) (open squares), in which cluster masses were inferred from fitting polytropic beta models (see
Sec. sec:tx). Dashed lines illustrate the $M_{500}$-$T_{\rm lum}$ relation measured in clusters simulated without cooling and feedback
by Evrard et al. (1996), which clearly disagrees with the data points. The other lines show the $M_{500}$-$T_{\rm lum}$ relations predicted
by the analytical models of Voit et al. (2003), which agree much better with the data. There is a slight difference between
models with $\sigma_8 = 0.9$ (dotted lines) and $\sigma_8 = 1.2$ because higher values of $\sigma_8$ lead to clusters with higher halo concentrations
that produce slightly higher temperatures. Tiny points show data for clusters simulated by Borgani et al. (2003) with radiative
cooling and feedback in the form of supernova-driven galactic winds. The left-hand panel uses the actual values of $M_{500}$, which
agree with the analytical models. The right-hand panel uses values of $M_{500}$ inferred from fitting polytropic beta models to the
observations, which underestimate true cluster masses, especially at low temperature, suggesting there may be a systematic
observational bias in this method of mass measurement.

Numerical simulations in which feedback is either
weak or non-existent produce clusters whose properties
are quite similar to the ones in this simple analytical
model. Early numerical investigations of cooling in individual
clusters gave inconclusive results (Lewis et al.,
2000; Suginohara and Ostriker, 1998), but simulations
by Muanwong et al. (2001) showed that adding cooling
to a large-scale cluster simulation could give an
$L_X$-$T_{\rm lum}$ relation like the observed one. Subsequent numerical work
has confirmed that result (e.g., Borgani et al., 2002a;
Davé et al., 2002; Kay et al., 2003; Valdarnini, 2003).
Adding radiative cooling to the cosmological model produces
good agreement with observations at all cluster
temperatures $> 2$ keV.

Even when the simulations implement strong feedback,
the X-ray scaling relations change remarkably little from
the cooling-only case (Borgani et al., 2002a; Kay et al., 2003).
The main effect on the $L_X$-$T_{\rm lum}$ relation of adding
strong feedback to simulations that already include cooling
is to slightly reduce the luminosity of cool ($\lesssim 2$ keV)
clusters, bringing them into better agreement with observations.
This insensitivity to the efficiency of feedback is
another strong indication that the cooling threshold governs
the entropy scale for similarity breaking.

One point of disagreement between the analytical models,
the simulations, and the observations concerns the
central temperature gradient. Many observed clusters
have a relatively small amount of gas in their cores whose
cooling time is less than the age of the universe, and in
those clusters the core temperature gradient is generally
positive ($dT/dr > 0$). In the simple analytical models
outlined above, no gas is allowed to be below the cooling
threshold, resulting in a core that is nearly isentropic and
thus has a negative temperature gradient ($dT/dr < 0$).
Likewise, simulations with cooling and feedback also tend
to have flat or negative temperature gradients in the
neighborhood of the core radius ($\sim 100$ kpc).

This problem deserves attention because elevated core
temperatures in models with cooling are what bring the
theoretical $M_{200}$-$T_{\rm lum}$ relation into agreement with observations.
Making the analytical model slightly more
realistic brings the predicted temperature gradient into
better agreement with observations. The discontinuous
cooling threshold applied by the simplest models is overly
crude because it completely removes gas just below the
threshold while gas just above the threshold does not
cool at all. Instead, cooling acts upon the entropy distribution
as described by equation (75). Voit et al. (2002)
show that modifying the baseline profile $K_{\rm NFW}$ using this
equation with $T = T_{200}$ for a time $t_0$ leads to an entropy
distribution that reproduces the observed temperature
gradients.

Simulations involving pure cooling do not agree with
this result. The temperature-gradient discrepancy between
analytical models and simulations in the pure-cooling
case is still not understood, but it may have something
to do with the implicit stability of the cooling process
in the analytical model. In that model the present-day
intracluster medium is spherically symmetric with a
positive entropy gradient, by definition, whereas thermal
instabilities in the simulations, which lead to a more heterogeneous
entropy pattern at each radius, may be at the
root of the negative temperature gradient. Perhaps the
observations are telling us that a stabilizing influence like
conduction erases small-scale thermal instabilities without
shutting off global cooling.



2. Role of Feedback 

The primary role of feedback is to regulate how many
baryons condense into stars and cold gas clouds. As mentioned
in the discussion of cooling, strong feedback does
not have a large effect on the $L_X$-$T_{\rm lum}$ relation, aside from
a slight decrease in the luminosity of low-temperature
clusters, as long as it is strong enough to shut off cooling
in the gas parcels that it affects. However, moderate
feedback that heats gas to $< 100\ {\rm keV\,cm^2}$ can boost $L_X$
because it does not allow the core gas to cool but rather
maintains it in an entropy state that allows it to radiate
considerable thermal energy.

Some preheating and feedback models adequately explain
the scaling relations without explicitly including
cooling (e.g., Babul et al., 2002; Balogh et al., 1999;
Bialek et al., 2001; Tozzi and Norman, 2001). In these
models, the minimum entropy level introduced by heating
is typically a free parameter that is adjusted to give
the best-fitting $L_X$-$T_{\rm lum}$ relation. The value of this best-fitting
entropy level turns out to be 100-400 keV cm$^2$,
approximately corresponding to the level of the cooling
threshold. This correspondence is consistent with the
idea that the amount of heating needed to explain the
mass-observable relation is determined by the need to
shut off cooling, in which case cooling still sets the entropy
scale of similarity breaking, even when it is not explicitly
included in the model (Voit et al., 2002).

From the standpoint of the mass-observable relations,
the most important effect of feedback itself has to do with
cluster richness. In both the simulations and the analytical
models, pure cooling leads to a larger fraction of
condensed baryons in cool clusters (Borgani et al., 2002b;
Borgani et al., 2003; Davé et al., 2002; Muanwong et al.,
2001; Voit et al., 2002), implying that these objects
might have a higher star-to-baryon ratio and therefore a
lower mass-to-light ratio. There are some observational
indications that the ratio of stellar luminosity to mass
in clusters is a function of mass (Lin et al., 2003), but
not all such studies agree. This issue will need to be settled
in order for optical richness measurements to deliver
high-precision mass functions (Sec. IV.D).

3. Role of Smoothing 

A full understanding of the $L_X$-$T_{\rm lum}$ relation may involve
feedback indirectly, through its smoothing effects
on the intergalactic medium (Sec. IV.A.3). If the observed
preservation of $K(r) \propto r^{1.1}$ entropy profiles is
indeed due to smoothing of the intergalactic medium
followed by accretion onto clusters, then the present-day
entropy profiles of clusters are evidence that galactic
winds were widespread prior to the accretion of
gas into today's clusters. Rather than just affecting
the core entropy of clusters, a modest amount of entropy
produced by early winds may have been amplified
by smooth accretion, boosting the entire entropy
profile by a common factor determined by the cooling
threshold (Voit and Ponman, 2003). If that is indeed
what happens, then it would explain the observed alteration
of cluster similarity such that $K(r/r_{200}) \propto T_{\rm lum}^{2/3}$
(Ponman et al., 2003; Pratt and Arnaud, 2003), which
leads directly to the relation $L_X \propto T_{\rm lum}^{3} (T_{200}/T_{\rm lum})^{3/2}$
for pure bremsstrahlung emission, in agreement with the
observations.
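
The link between the entropy scaling and the slope of the luminosity-temperature relation is a matter of exponent bookkeeping. The sketch below assumes, for illustration, that $T_{200} \propto T_{\rm lum}$, that the gas density at fixed $r/r_{200}$ scales as $(T/K)^{3/2}$, that the luminosity scales as $T^2$ times the square of that density at fixed redshift, and that emission is pure bremsstrahlung. The function name is hypothetical.

```python
from fractions import Fraction


def lx_t_slope(entropy_slope):
    """Logarithmic slope of L_X versus T when K(r/r200) scales as T^a.

    Assumes rho_g ~ (T/K)^(3/2), L_X ~ T^2 * rho_g^2 at fixed redshift
    (pure bremsstrahlung), and T_200 proportional to T_lum."""
    a = Fraction(entropy_slope)
    density_slope = Fraction(3, 2) * (1 - a)   # rho_g ~ T^(3(1-a)/2)
    return 2 + 2 * density_slope               # extra factor from rho_g^2


if __name__ == "__main__":
    print("self-similar, K ~ T:        slope =", lx_t_slope(1))
    print("altered similarity, T^(2/3): slope =", lx_t_slope(Fraction(2, 3)))
```

With $K \propto T$ the slope is 2, the self-similar value, while $K \propto T^{2/3}$ steepens it to 3.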



4. Predictions for Evolution 

Preheating, the cooling threshold, and the altered similarity
indicative of smoothing affect the time-dependent
behavior of the $L_X$-$T_{\rm lum}$ relation differently, offering a
way to gather further information about their relative
influence on cluster structure. Defining a dimensionless
luminosity integral $\hat{L}$ such that

$$\hat{L} \propto \int_0^{r/r_{200}} \left( \frac{\rho_g}{\rho_{\rm cr}} \right)^2 \hat{r}^2 \, d\hat{r} ,$$

one can express the scaling of a cluster's integrated X-ray
luminosity as

$$L_X \propto \Lambda_c(T_{\rm lum})\, M_{200}\, \rho_{\rm cr}\, \hat{L} \qquad (78)$$

$$\propto T_{\rm lum}^{2} \left( \frac{T_{200}}{T_{\rm lum}} \right)^{3/2} H(z)\, \hat{L} , \qquad (79)$$

where the first line assumes the cluster is approximately
isothermal and the second line is an approximation
that assumes pure bremsstrahlung emission. The self-similar
case,

$$L_X \propto T_{\rm lum}^{2}\, H(z) , \qquad (80)$$

is well known to be a poor description of the data because
its power-law slope at $z \approx 0$ is too shallow.
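
The step from the first line of the luminosity scaling to the second can be made explicit. As a sketch, assuming the standard virial scalings $M_{200} \propto T_{200}^{3/2} / H(z)$ and $\rho_{\rm cr} \propto H^2(z)$, together with $\Lambda_c \propto T^{1/2}$ for pure bremsstrahlung (none of which is spelled out at this point in the text):

```latex
\begin{align*}
L_X &\propto \Lambda_c(T_{\rm lum})\, M_{200}\, \rho_{\rm cr}\, \hat{L} \\
    &\propto T_{\rm lum}^{1/2} \cdot \frac{T_{200}^{3/2}}{H(z)} \cdot H^{2}(z) \cdot \hat{L} \\
    &= T_{\rm lum}^{2} \left( \frac{T_{200}}{T_{\rm lum}} \right)^{3/2} H(z)\, \hat{L} .
\end{align*}
```

For self-similar clusters $\hat{L}$ is the same for every cluster and $T_{200} \propto T_{\rm lum}$, which recovers $L_X \propto T_{\rm lum}^2 H(z)$.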

The modified-entropy models of Voit et al. (2002) show
that enforcing a minimum core entropy level $K_{\min}$ breaks
self-similarity in such a way that $\hat{L} \propto K_{\min}^{-3/2}\, T_{200}^{3/2}\, H^{-2}$, if
$K_{\min}$ is a significant fraction of the cluster's characteristic
entropy $K_{200}$. In the pure preheating case, $K_{\min}$ is
assumed to be independent of both cluster mass and of
redshift, leading to

$$L_X \propto T_{\rm lum}^{7/2} \left( \frac{T_{200}}{T_{\rm lum}} \right)^{3} H^{-1}(z) .$$

In other words, pure preheating steepens the $L_X$-$T_{\rm lum}$ relation a
little more than necessary but causes high-redshift clusters
to be less luminous than one would expect from
their temperatures because the entropy floor $K_{\min}$ is a
larger proportion of $K_{200}$ earlier in time. This prediction
appears to conflict with recent observations indicating
evolution in the opposite direction (e.g., Vikhlinin et al.,
2002). Tying the minimum entropy scale to the cooling
threshold $K_c \propto T_{\rm lum}^{2/3}\, t^{2/3}$ helps to solve this problem
because it leads to

$$L_X \propto T_{\rm lum}^{5/2} \left( \frac{T_{200}}{T_{\rm lum}} \right)^{3} \frac{1}{H(z)\, t(z)} .$$

In this case, a little bit of tilt in the Tium/Lboo relation, 
consistent with observations (see Table Pl , is needed to 
sufficiently steepen the Lx-Ti^m relation, and the sense 
of the evolution agrees with observations. In a ACDM 
universe, the redshift dependence of the luminosity nor- 
malization is H-^t-^ - (1 + z)"-^ - H^'^^ out to 
z ^ 0.5. Altered similarity linked to the cooling thresh- 
old is in better agreement with the slope but produces 
less evolution. Assuming density profiles that scale as 
Pg{r/r20Q) oc {Tum/Kcf/'^ yields 

" HHz)tHz) ■ 

The normalization of luminosity in this relation varies as 
ff-3i^2 _ (1 + 2)0-3 _ ^0.5 to 2 _ 0.5. 
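
The approximate power laws quoted for these two normalizations can be
checked numerically. The sketch below is only illustrative: it assumes
a flat $\Lambda$CDM universe with $\Omega_M = 0.3$ and
$\Omega_\Lambda = 0.7$ (values chosen here, not taken from this
section), uses the standard analytic age for that cosmology, and the
function names are hypothetical.

```python
import math

OMEGA_M = 0.3   # assumed matter density (concordance-like value)
OMEGA_L = 0.7   # assumed dark-energy density; flat universe


def hubble(z):
    """H(z)/H0 for flat LambdaCDM."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def age(z):
    """Cosmic time t(z) in units of 1/H0 for flat LambdaCDM."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)


def normalization(z, h_power, t_power):
    """H^h_power * t^t_power at redshift z, relative to z = 0."""
    now = hubble(0.0) ** h_power * age(0.0) ** t_power
    return hubble(z) ** h_power * age(z) ** t_power / now


def effective_slopes(z, h_power, t_power):
    """Equivalent power-law indices in (1+z) and in H(z) at redshift z."""
    ln_norm = math.log(normalization(z, h_power, t_power))
    return ln_norm / math.log(1.0 + z), ln_norm / math.log(hubble(z))


if __name__ == "__main__":
    cases = [("entropy floor at cooling threshold, H^-1 t^-1", -1, -1),
             ("altered similarity, H^-3 t^-2", -3, -2)]
    for label, h_power, t_power in cases:
        a, b = effective_slopes(0.5, h_power, t_power)
        print(f"{label}: ~(1+z)^{a:.2f} ~ H^{b:.2f} at z = 0.5")
```

Under these assumptions the indices at $z = 0.5$ come out close to the
rounded values given in the text; different density parameters would
shift them slightly.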

Observations of evolution in the luminosity-temperature relation are
not yet precise enough to distinguish between these latter two
possibilities. The usual procedure is to compare the
$L_X$-$T_{\rm lum}$ relation measured in a significantly redshifted
cluster sample to the relation measured at $z \sim 0$. Vikhlinin et
al. (2002) were the first to detect evolution, finding
$L_X(T_{\rm lum}) \propto (1+z)^{b_{LT}}$ with
$b_{LT} = 1.5 \pm 0.3$, assuming a $\Lambda$CDM cosmology. These
authors compared the low-redshift sample of Markevitch (1998) to a
collection of 22 clusters in the redshift range $0.4 < z < 0.8$.
Lumb et al. (2003) found a similar amount of evolution,
$b_{LT} = 1.52^{+0.26}_{-0.27}$, using a smaller sample of eight
clusters at $z \approx 0.4$, but not all studies find such strong
evolution, which exceeds the predictions of the basic models outlined
above. For example, Ettori et al. (2003) find
$b_{LT} = 0.62 \pm 0.28$ for a sample of 28 clusters at $z > 0.4$
using the Markevitch (1998) sample as the low-redshift baseline and
$b_{LT} = 0.98 \pm 0.20$ relative to the Arnaud and Evrard (1999)
low-redshift baseline. Furthermore, the strength of the evolution
found by Ettori et al. (2003) becomes smaller for high-redshift
clusters, consistent with no evolution at all
($b_{LT} = 0.04 \pm 0.33$) when they include only their 16 clusters
with $z > 0.6$ in the comparison with the Markevitch sample.
Apparently, there are some systematic uncertainties in these
evolution measurements that need to be accounted for.
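
One way to see how far these measurements sit from one another, and
from the model normalizations of roughly $(1+z)^{0.5}$ and
$(1+z)^{0.3}$, is to express the offsets in units of the quoted
uncertainties. The sketch below is a rough illustration only: it
treats the quoted errors as Gaussian and independent, which is an
assumption (several of these measurements share a baseline sample),
and the helper name is hypothetical.

```python
import math


def tension(value_a, sigma_a, value_b, sigma_b=0.0):
    """Offset between two estimates in units of their combined 1-sigma error."""
    return (value_a - value_b) / math.hypot(sigma_a, sigma_b)


# (b_LT, 1-sigma error) as quoted in the text
MEASURED = {
    "Vikhlinin et al. (2002)": (1.5, 0.3),
    "Ettori et al. (2003), Markevitch baseline": (0.62, 0.28),
    "Ettori et al. (2003), Arnaud-Evrard baseline": (0.98, 0.20),
    "Ettori et al. (2003), z > 0.6 only": (0.04, 0.33),
}

# Approximate (1+z) indices of the two model normalizations
MODEL_SLOPES = {"cooling-threshold floor": 0.5, "altered similarity": 0.3}

if __name__ == "__main__":
    for name, (b, sigma) in MEASURED.items():
        offsets = ", ".join(
            f"{tension(b, sigma, m):+.1f} sigma vs {label}"
            for label, m in MODEL_SLOPES.items()
        )
        print(f"b_LT = {b} +/- {sigma} [{name}]: {offsets}")

    v = MEASURED["Vikhlinin et al. (2002)"]
    e = MEASURED["Ettori et al. (2003), Markevitch baseline"]
    print(f"Vikhlinin vs Ettori, same baseline: {tension(*v, *e):.1f} sigma")
```

The spread among samples is comparable to the offsets from the models,
which is consistent with the remark that systematic uncertainties
still dominate.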



D. Constraints on Baryon Condensation 

The ultimate test for feedback models is that they must 
account for both the proportion of condensed baryons to 
hot gas in clusters and any dependence of that propor- 
tion on cluster mass. In order to apply that test, we 
would like to have firm numbers for the amount of con- 
densed baryons in clusters, but such measurements can 
be difficult. Even if the amount of starlight were per- 
fectly measured, converting integrated starlight to stellar 
mass involves uncertain assumptions about both the star- 
formation history of a cluster and the distribution func- 
tion of stellar masses at birth, a quantity known as the 
initial mass function. Any variation in the star-formation 
history or initial mass function with cluster mass can lead 
to spurious systematic trends in the cluster mass function 
inferred from cluster richness. 

Baryons contained in cold clouds are even harder to constrain because
gaseous matter in this form can be nearly invisible, if it is
sufficiently cold (Ferland et al., 1994, 2002). However, it seems
unlikely that large amounts of baryons exist in such a form, at least
in rich clusters. Adding the amount of baryons inferred from starlight
to the amount of hot gas observed in rich clusters accounts for nearly
all the baryons expected from the global ratio of baryons to dark
matter, leaving little room in the baryon budget for cold gas clouds.

The situation is less clear in lower-mass clusters and 
groups of galaxies, in which the proportion of hot gas to 
dark matter is significantly smaller. Summing the masses 
of stars and hot gas accounts for only about half the expected
number of baryons in some cases, yet there is
no observational evidence for large quantities of cold gas
(e.g., Waugh et al., 2002). Circumstantial evidence argues
against there being large reservoirs of cold baryons
in groups. Presumably, the rich clusters in which we now 
see virtually all the baryons were hierarchically assem- 
bled from objects like the baryon-poor groups of galaxies 
we observe today. If large amounts of baryons in their 
higher-redshift counterparts were locked away in some 
cold, condensed form, then how were they released when 
these groups of galaxies merged to form large clusters? 

A more complete accounting of intracluster baryons,
especially in low-mass systems, is sorely needed in order
to test the various feedback models described in Sec. IV.B.
The rest of this section summarizes some of the recent 
work on constraining the amount of condensed intraclus- 
ter baryons in the form of stars, the prospects for mea- 
suring baryon condensation through the S-Z effect, and 
X-ray observations of nearby clusters that may help solve 
the puzzles surrounding condensation and feedback. 



1. Mass and Light in Clusters 

Inferences of stellar mass from the observed starlight are generally
based on a mass-to-light ratio expressed in solar units. That is, the
mass-to-light ratio of the Sun in all wavebands equals unity. Because
young stellar populations tend to emit large amounts of blue light
that quickly dies out as the population ages, most recent assessments
of the stellar mass in clusters have concentrated on measurements of
infrared starlight in the $K$ band at roughly 2 microns. Observing
starlight in this band minimizes the uncertainties owing to a
cluster's star formation history. The old stellar populations
characteristic of elliptical galaxies tend to have a $K$-band
mass-to-light ratio $\Upsilon_K \approx 0.8\,h_{70}$, and
mass-to-light ratios in spiral and irregular galaxies can be up to a
factor of two smaller (Bell and de Jong, 2001). For the mix of
galaxies seen in clusters, Lin et al. (2003) estimate that the mean
mass-to-light ratio ranges from $\Upsilon_K = 0.7\,h_{70}$ to
$0.8\,h_{70}$ as cluster temperature climbs from 2 keV to 10 keV.
From this mass-to-light ratio, they infer that the fraction of
intracluster baryons in stellar form is $f_* \approx 0.1$ for rich
clusters (see also Balogh et al., 2001b). Notice that this value is
about half that predicted by current simulations of cluster formation
including strong feedback, a discrepancy that could become even larger
with higher-resolution simulations (Borgani et al., 2003).

Many studies, but not all of them, suggest that the fraction of
condensed baryons in stars may be a function of cluster mass. The
ratio of total cluster mass to $K$-band light within $r_{500}$ found
by Lin et al. (2003) is
$\Upsilon_K = (47 \pm 3)\,h_{70}\,(M_{500}/3 \times 10^{14}\,h_{70}^{-1}\,M_\odot)^{0.31}$,
which translates to a temperature dependence
$\Upsilon_K \propto T_{\rm lum}^{0.5 \pm 0.1}$. The ratio of stellar
mass to total mass in this study therefore ranges from ~2.2% at
$10^{14}\,h_{70}^{-1}\,M_\odot$ to ~1.2% at
$10^{15}\,h_{70}^{-1}\,M_\odot$. Similar trends with shallower slopes
are seen at other wavelengths. Bahcall and Comerford (2002) find the
ratio of total mass to starlight in the heart of the visible spectrum
($V$-band) is $\Upsilon_V \propto T_{\rm lum}^{0.3 \pm 0.1}$. In blue
light ($B$-band), Girardi et al. (2002) find
$\Upsilon_B \propto M^{0.25}$. However, other studies have found no
significant dependence on mass. According to Kochanek et al. (2003)
the $K$-band mass-to-light ratio inside $r_{200}$ scales as
$\Upsilon_K \propto M_{200}^{-0.10 \pm 0.09}$.
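
The stellar mass fractions quoted above follow from dividing a stellar
mass-to-light ratio by the total mass-to-light ratio. The sketch below
illustrates that arithmetic with the Lin et al. (2003) fit as quoted
in the text. Pairing a stellar ratio of 0.7 with the low-mass end and
0.8 with the high-mass end is an assumption made here (the text ties
those values to temperature, not mass), and the function names are
hypothetical.

```python
def total_mass_to_light_k(m500):
    """Total K-band mass-to-light ratio (solar units, times h70).

    m500 is in units of h70^-1 solar masses; central value of the fit.
    """
    return 47.0 * (m500 / 3.0e14) ** 0.31


def stellar_fraction(m500, stellar_mass_to_light):
    """Stellar-to-total mass ratio, (M_star / L_K) / (M_total / L_K).

    Both ratios carry the same factor of h70, so it cancels here.
    """
    return stellar_mass_to_light / total_mass_to_light_k(m500)


if __name__ == "__main__":
    # (M500, assumed stellar K-band mass-to-light ratio)
    for m500, ml_star in [(1.0e14, 0.7), (1.0e15, 0.8)]:
        frac = stellar_fraction(m500, ml_star)
        print(f"M500 = {m500:.0e}: stellar fraction ~ {100 * frac:.1f}%")
```

With these inputs the fractions come out near 2% and 1%, close to the
range quoted in the text; the small differences reflect the rounded
inputs.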



2. Intergalactic Stars 

Measurements of the total stellar luminosity in clusters generally
focus on the light from galaxies, but what about stars that are not in
galaxies? At least some of a cluster's stars float unmoored in the
spaces between a cluster's galaxies (Ferguson et al., 1998). These
stars are thought to have originated in galaxies but were later
stripped from their homes by tidal forces during a close encounter
with another galaxy. Current observational limits, however, indicate
that no more than 10-20% of a cluster's stars are outside of galaxies
(Durrell et al., 2002), implying that failing to account for
intergalactic stars does not lead to large errors in measured
mass-to-light ratios.
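
The size of the resulting error can be bounded with one line of
arithmetic. The sketch below assumes that intergalactic stars have the
same mass-to-light ratio as stars in galaxies, so that a fraction $f$
of stars outside galaxies means the galaxies contribute $(1 - f)$ of
the total starlight; the function name and the example value of 47 are
used purely for illustration.

```python
def corrected_mass_to_light(measured, intergalactic_fraction):
    """Mass-to-light ratio after adding back light from stars outside galaxies.

    `measured` uses galaxy light only. If a fraction f of the stars lies
    outside galaxies, the total light is L_gal / (1 - f), so the corrected
    ratio is the measured one times (1 - f).
    """
    if not 0.0 <= intergalactic_fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    return measured * (1.0 - intergalactic_fraction)


if __name__ == "__main__":
    for f in (0.1, 0.2):
        corrected = corrected_mass_to_light(47.0, f)
        print(f"f = {f:.1f}: 47 -> {corrected:.1f}")
```

Even at the upper limit of 20% the correction is smaller than the
factor-of-two differences between systems discussed above.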



3. Global S-Z Effect 

If the baryons missing in low-mass clusters are not in 
condensed form, then they must be in the form of hot gas 
beyond the regions detectable with X-ray telescopes. If 
that is indeed the case, then the best way of finding them 
may be through the Sunyaev-Zeldovich effect. Section
II.C.1 showed that the integrated microwave distortion
from a cluster scales with the electron temperature of 
the cluster and the overall mass in hot electrons. If a sig- 
nificant proportion of baryons have condensed, then the 
associated electrons are also locked away in cold clouds, 
where they don't contribute to the S-Z signal. 

Simulations of cluster formation that include cooling indicate how the
mean value of the $y$-distortion owing to clusters depends on cooling
and feedback processes. Models by da Silva et al. (2001) produce
$y = 3.2 \times 10^{-6}$ in the non-radiative case, dropping to
$y = 2.3 \times 10^{-6}$ in the case of radiative cooling without
feedback. The difference between the radiative case and non-radiative
case is somewhat smaller when feedback is implemented. White et al.
(2002) find $y = 2.5 \times 10^{-6}$ in the non-radiative case and
$y = 2.1 \times 10^{-6}$ when both cooling and feedback are turned on.
Testing for baryon condensation in this way may eventually be
possible, but the mean value of the S-Z distortion is also very
sensitive to other cosmological parameters, such as $\sigma_8$, which
will have to be very well constrained before we can use the global
$y$ parameter to test feedback models.
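
The size of the effect being sought can be read off from the ratios of
the simulated values. The sketch below computes the fractional drop in
the mean $y$ for each pair of runs quoted in the text; only the
leading coefficients are used, since the common power of ten cancels
in the ratio. The dictionary layout and function name are
illustrative.

```python
# Leading coefficients of the mean y-distortion quoted in the text
MEAN_Y = {
    "da Silva et al. (2001)": {"non-radiative": 3.2, "cooling, no feedback": 2.3},
    "White et al. (2002)": {"non-radiative": 2.5, "cooling + feedback": 2.1},
}


def fractional_drop(reference, value):
    """Fraction by which `value` falls below `reference`."""
    return (reference - value) / reference


if __name__ == "__main__":
    for study, runs in MEAN_Y.items():
        reference = runs["non-radiative"]
        for label, y in runs.items():
            if label != "non-radiative":
                drop = fractional_drop(reference, y)
                print(f"{study}, {label}: {100 * drop:.0f}% below non-radiative")
```

Drops of this size, a few tens of percent, set the precision with
which the other cosmological parameters would need to be known.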



4. Cooling Flows in Clusters

Cores of present-day clusters are among the best places in the
universe to observe the interplay between condensation and feedback.
Gas at the centers of many clusters can radiate an amount of energy
equal to its thermal energy in less than a billion years, yet the
majority of that gas is not condensing (see Donahue and Voit, 2004,
for a recent review). Early interpretations of clusters with central
cooling times less than the age of the universe suggested that the
core gas should gradually condense and be replaced by the surrounding
material in an orderly flow of cooling gas (Cowie and Binney, 1977;
Fabian and Nulsen, 1977; Mathews and Bregman, 1978). The mass
condensation rates inferred from X-ray imaging ranged as high as
$\sim 10^2$ to $10^3\,M_\odot\,{\rm yr}^{-1}$, implying that the cores
of these "cooling-flow" clusters should contain
$\sim 10^{12}\,M_\odot$ in the form of condensed baryons. However,
exhaustive searches for this mass sink generally have not found stars
forming at such a high rate (McNamara and O'Connell, 1989; O'Connell
and McNamara, 1989), nor have they found sufficiently large
collections of cold baryonic clouds to account for the deposited mass
(Braine and Dupraz, 1994; McNamara and Jaffe, 1994; O'Dea et al.,
1994, 1998; Voit and Donahue, 1995).
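
Two back-of-envelope numbers lie behind this paragraph: the central
cooling time and the mass a steady cooling flow would deposit. The
sketch below is a rough estimate under loudly stated assumptions, not
a calculation from the text: fully ionized pure hydrogen, thermal
bremsstrahlung only with the standard emissivity coefficient
$1.4 \times 10^{-27}\,T^{1/2}\,n_e n_i\,\bar{g}$ (cgs) and a Gaunt
factor of 1.2, and illustrative core values of
$n_e = 0.05\ {\rm cm^{-3}}$ and $T = 3 \times 10^7$ K. Line cooling
from metals, ignored here, would shorten the cooling time.

```python
import math

K_BOLTZMANN = 1.380649e-16      # erg / K
SECONDS_PER_YEAR = 3.15576e7    # Julian year
FREE_FREE_COEFF = 1.4e-27       # erg cm^3 s^-1 K^-1/2, bremsstrahlung coefficient
GAUNT_FACTOR = 1.2              # assumed frequency-averaged Gaunt factor


def cooling_time_yr(n_e, temperature):
    """Thermal energy / bremsstrahlung loss rate for ionized hydrogen, in years.

    Energy density is (3/2)(n_e + n_p) k T = 3 n_e k T, and the emissivity
    is FREE_FREE_COEFF * GAUNT_FACTOR * sqrt(T) * n_e^2.
    """
    thermal = 3.0 * n_e * K_BOLTZMANN * temperature
    losses = FREE_FREE_COEFF * GAUNT_FACTOR * math.sqrt(temperature) * n_e ** 2
    return thermal / losses / SECONDS_PER_YEAR


def deposited_mass(rate_msun_per_yr, duration_yr):
    """Mass (solar masses) condensed by a steady flow over `duration_yr`."""
    return rate_msun_per_yr * duration_yr


if __name__ == "__main__":
    t_cool = cooling_time_yr(n_e=0.05, temperature=3.0e7)
    print(f"cooling time ~ {t_cool / 1e9:.2f} Gyr")
    for rate in (1.0e2, 1.0e3):
        mass = deposited_mass(rate, 1.0e10)
        print(f"{rate:.0e} Msun/yr for 1e10 yr -> {mass:.0e} Msun")
```

For these assumed core conditions the cooling time comes out somewhat
below a billion years, and a flow of a hundred solar masses per year
sustained for $10^{10}$ years would deposit about $10^{12}$ solar
masses.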



Now X-ray spectroscopy itself is showing that condensation proceeds at
a considerably slower rate, if it happens at all. The central gas in
clusters with short cooling times appears to reach temperatures
$\sim T_{\rm lum}/2$, but very little X-ray line emission is seen from
gas at $< T_{\rm lum}/3$ (Peterson et al., 2001, 2003). Some sort of
heating mechanism seems to be inhibiting condensation below this
temperature. There are plenty of candidates for resupplying the
radiated heat energy (see Sec. IV.B), including supernovae, outflows
from active galactic nuclei, electron thermal conduction, and
turbulent mixing, but there is still no consensus on the relative
importance of these mechanisms.

A reduced amount of condensation still appears to be occurring. For
example, plenty of circumstantial evidence links short central cooling
times with star formation at the centers of clusters. Objects whose
central cooling time is less than the age of the universe frequently
contain emission-line nebulae whose properties suggest that they are
energized primarily by hot, young stars (Johnstone et al., 1987; Voit
and Donahue, 1997). Nebulae like these are never seen in clusters
where the central cooling time is greater than the universe's age (Hu
et al., 1985). Also, objects with prominent nebulae tend to have
abundant cool molecular hydrogen gas, the seed material for star
formation (Donahue et al., 2000; Edge, 2001; Edge and Frayer, 2003).
Efforts to estimate the star formation rate from the ultraviolet light
emanating from the centers of clusters indicate that it may be
consistent with the current upper limits on the condensation rate
drawn from X-ray spectroscopy (McNamara et al., 2004).

An understanding of what regulates condensation and star formation at
the centers of present-day clusters will help to solve more than just
the overcooling problem of galaxy formation. It is also relevant to an
aspect of bright galaxies that remains difficult to understand. The
luminosity distribution function of galaxies cuts off very sharply at
the high-luminosity end, far more sharply than called for in standard
models of galaxy formation. Extremely powerful feedback can produce a
sharp cutoff, but the amount of energy input required seems to
implicate active galactic nuclei as the primary feedback source (e.g.,
Benson et al., 2003; Scannapieco and Oh, 2004). Alternatively, thermal
conduction might produce a sharp cutoff because its efficiency rises
so rapidly with temperature (Benson et al., 2003; Fabian et al.,
2002). As the halo of a massive galaxy grows and its characteristic
temperature rises through a critical threshold $\sim 10^7$ K,
conduction can strongly suppress further cooling and star formation,
if it is not inhibited by magnetic fields. Detailed studies of cluster
cores will be needed to test these possibilities. Early efforts are
indicating that conduction might not be efficient enough to prevent
overcooling (Dolag et al., 2004; Jubelgas et al., 2004).
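
The steepness of the temperature dependence is what makes conduction a
candidate for a sharp cutoff. As an illustration, the sketch below
assumes the classical Spitzer form for an unmagnetized plasma, in
which the conductivity scales as $T^{5/2}$; that scaling is a standard
result brought in here, not something stated in this section, and the
function name is hypothetical.

```python
SPITZER_INDEX = 2.5  # assumed: classical unmagnetized conductivity ~ T^(5/2)


def conductivity_ratio(t_hot, t_cool, index=SPITZER_INDEX):
    """Ratio of thermal conductivities at two temperatures, kappa ~ T^index."""
    if t_hot <= 0.0 or t_cool <= 0.0:
        raise ValueError("temperatures must be positive")
    return (t_hot / t_cool) ** index


if __name__ == "__main__":
    for factor in (2.0, 4.0, 10.0):
        ratio = conductivity_ratio(factor, 1.0)
        print(f"temperature x{factor:g} -> conductivity x{ratio:.0f}")
```

Under this assumption a factor of four in halo temperature changes the
conductivity by a factor of about thirty, so the transition from
negligible to important conduction spans a narrow range in mass.
Magnetic suppression, mentioned above, would lower the normalization
by an uncertain factor.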



V. CONCLUDING REMARKS 

The next decade of research into cluster evolution 
promises to be very exciting. Large optical surveys like 
the Sloan Digital Sky Survey are greatly increasing the 
number of well-studied clusters of galaxies in the low- 
redshift universe. Deep surveys looking for the Sunyaev-
Zeldovich effect will be finding thousands of clusters to
distances well beyond a redshift of z = 1. The Chandra 
and XMM-Newton X-ray observatories are providing our 
most detailed look yet at the intracluster medium, its 
thermodynamical state, and some of the feedback pro- 
cesses that regulate condensation of intergalactic gas into 
galaxies and stars. Also, dedicated X-ray satellite mis- 
sions to survey a large fraction of the sky for distant 
clusters are currently being planned. 

Making the most of these opportunities will require 
cooperation between observers in those different wave- 
bands, and theoretical modeling that closely links those 
cluster observables to cosmological parameters. Optical 
and infrared followup of S-Z surveys will be critical in order
to determine the redshifts of the cluster candidates.
X-ray followup of a subset of the S-Z clusters will also be
necessary to establish how the thermodynamics of galaxy 
formation affects evolution of the mass-observable rela- 
tions in the microwave band. Concentrated efforts to 
observe a calibration set of clusters in all of these wave- 
bands will be very valuable in helping to establish how 
well the various observables trace mass and the scatter 
in each of these observables at a given mass. 

If the ACDM concordance model is indeed a good de- 
scription of the overall architecture of the universe and 
its initial perturbation spectrum, then the parameters 
describing the cosmological context in which galaxy formation
happens ought to be quite precisely established
within this decade. Studies of cluster evolution will be 
just one part of this overall effort, which also includes 
distance determinations to high-redshift supernovae, in- 
creasingly sensitive observations of the cosmic microwave 
background, and the mapping of large-scale structure. 
However, consistency between the dynamics of cluster 
evolution and the geometry of the universe, as mea- 
sured with supernova and microwave observations, will 
stand as a particularly critical test of the overall model. 
With success, most of the remaining secrets about galaxy
formation (other than what dark matter and dark energy
actually are) will concern baryons and their complex
cooling and feedback processes.

Our understanding of what baryons do is rapidly pro- 
gressing, thanks in large part to large-scale cosmological 
simulations on massively parallel computers. Clusters 
and their evolution place unique constraints on those 
models because clusters are the only places in the uni- 
verse where the majority of the baryons emit detectable 
radiation, revealing their thermodynamic state and ele- 
mental abundances. Galaxy formation has clearly left its 
mark in the intracluster medium, but we are just begin- 
ning to decipher what it has written there in the gases 
between the galaxies. Perhaps in ten more years there 
will be as much optimism about understanding the baryonic
side of galaxy formation as there is now about
understanding the darker side.